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ABSTRACT.  We  consider  the  formulation  and  local  analysis  of  various  quadratically  convergent 
methods  for  solving  the  symmetric  matrix  inverse  eigenvalue  problem.  One  of  these  methods  ;3 
new.  We  study  the  case  where  multiple  eigenvalues  are  given:  we  show  how  to  state  the  problem 
so  that  it  is  not  overdetermined,  and  describe  how  to  modify  the  numerical  methods  to  retain 
quadratic  convergence  on  the  modified  problem.  We  give  a  general  convergence  analysis  which 
covers  both  the  distinct  and  the  multiple  eigenvalue  cases.  We  also  present  numerical  experiments 
which  illustrate  our  results. 
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§1.   INTRODUCTION 

Let  A{c)  be  the  afRne  family 

n 

A{c)=Ao-h^CkAk,  (1.1) 

where  c  6  R"  and  {Ak}  are  real  symmetric  nxn  matrices.  Denote  the  eigenvalues  of  A{c) 
by  {At(c)}",  where 

Ai(c)  <...  <  A„(c). 

The  following  is  called  an  inverse  eigenvalue  problem: 

Problem   l.    Given  real  numbers  Aj  <  ...  <  A;,  End  c  €  R""  such  that  A,(c)  =  A,', 
i   =    l,...,n. 

There  is  a  large  literature  on  conditions  for  existence  and  uniqueness  of  solutions  to  Prob- 
lem 1  (or  its  variations)  in  majiy  special  cases.  In  this  paper  we  are  concerned  with  the 
formulation  and  local  analysis  of  various  quadratically  convergent  methods  to  solve  the 
problem,  assuming  the  existence  of  a  solution.  Extending  our  techniques  to  give  methods 
with  good  global  behavior  is  an  importamt  task  which  we  shall  not  explicitly  address. 

The  paper  is  organized  as  follows.  In  §1.1  we  discuss  some  of  the  important  motivating 
applications  which  arise  in  the  physical  and  social  sciences.  Many  of  these  lead  to  closely 
related  vaxiations  of  the  model  problem  given  above.  In  §2  we  confine  our  attention  to  the 
case  where  the  given  eigenvalues  {A,'}^  are  distinct,  and  describe  several  numerical  meth- 
ods. Four  of  these  are  related  to  Newton's  method  and  are  generally  locally  quadratically 
convergent.  Of  these  four  methods,  three  are  known  in  the  literature,  and  one  is  apparently 
new.  In  §3  we  discuss  the  case  where  multiple  eigenvalues  are  present  in  the  set  {Aj'}^.  It 
is  well  known  that  the  eigenvalues  are  not  differentiable  functions  at  the  points  where  they 
coalesce.  Nonetheless,  the  behavior  of  the  numerical  methods  in  these  circumstances  has 
received  little  attention.  In  §3.1  we  discuss  the  case  where  the  numerical  methods  of  §2  are 
applied,  without  modifications,  to  problems  with  multiple  eigenvalues.  Assuming  Prob- 
lem 1  has  a  solution,  we  show  that  the  methods  retain  local  quadratic  convergence,  with 


little  or  no  modification,  even  though  the  eigenvalues  are  not  dififerentiable  at  the  solution. 
In  §3.2  we  axgue  that  Problem  1  is  generally  overdetermined  when  multiple  eigenvalues  are 
present,  and  show  how  to  modify  the  problem  so  that  it  has  the  appropriate  number  of 
parameters  and  target  eigenvalues.  We  then  explain  how  to  modify  the  numerical  methods 
of  §2  to  retain  quadratic  convergence  on  the  modified  problem.  In  §3.3  we  give  a  general 
convergence  analysis  which  covers  both  the  distinct  and  the  multiple  eigenvalue  ccises.  In 
§4  we  present  numerical  experiments  which  illustrate  our  results. 

Before  we  proceed  we  must  mention  that  in  many  applications  the  problem  to  be  solved 
is  different  from  Problem  1.  Sometimes  A{c)  is  a  nonlinear  matrix  function  of  c.  In  §2.1 
we  briefly  discuss  how  to  adapt  the  numerical  methods  for  this  case.  Other  applications 
lead  to  variations  of  Problem  1  that  include  the  following:  the  number  of  given  eigenvalues 
is  less  than  n,  the  order  of  the  matrices;  the  number  of  parameters  is  not  the  same  as 
n;  there  cure  constraints  on  c;  there  is  a  functional  to  be  minimized  subject  to  eigenvalue 
constraints  of  the  form  given  by  Problem  1;  the  constraints  on  some  of  the  eigenvalues 
are  inequalities  instead  of  equalities.  (This  last  case  seems  to  be  particularly  common  in 
practical  applications.)  In  §1.1  we  give  a  few  examples  to  illustrate  how  some  of  these 
applications  arise.  We  think  that  it  is  not  difficult  to  see  how  the  problem  formulations, 
numerical  methods,  and  convergence  analyses  can  be  extended  to  some  of  the  variations 
of  Problem  1.  However  we  shall  not  give  any  details  here. 

A  special  case  of  Problem  1,  which  is  frequently  encountered,  is  obtained  when  the  linear 
family  (l.l)  is  defined  by 


Ak  =  efcCjij  k  =  l,...,n, 


where  e^  is  the  k^    unit  vector,  so  that 

A{c)  =  Ao  +  D  (1.2) 

where  D  =  diag{ck)-   This  problem  is  known  as  the  additive  inverse  eigenvalue  problem. 
Conditions  for  existence  and  uniqueness  of  solutions  to  this  problem  are  well  understood. 


Friedland  (1977)  showed  that  the  problem  is  always  solvable  over  the  complex  field,  and 
it  is  easy  to  construct  examples  that  show  that  it  is  not  always  solvable  over  the  reals. 


§1.1  Applications. 

We  will  now  describe  several  inverse  eigenvalue  problems  arising  in  various  areas  of 
application. 

One  classical  example  is  the  solution  of  inverse  Sturm-Liouville  problems.  Consider  for 
example  the  boundary  value  problem 

-u"(x)  +p{x)u(x)  =  Au(x) 

u(0)  =  u(7r)  =  0. 

Suppose  that  the  potential  p(i)  is  unknown,  but  the  spectrum  {A'}f^  is  given.  Can 
we  determine  p{x)l  This  continuous  problem  has  been  studied  by  many  authors;  see 
Borg  (1946),  Gelfand  and  Levitan  (1955)  and  Hald  (1972).  We  are  mainly  interested  in 
the  discrete  analogue  of  this  problem.  Let  us  use  a  uniform  mesh,  defining  h  =  ;^^, 
Ufc  =  u{kh),  pk  =  p{kh),  k  =  l,...,n,  and  assume  that  {A^}^  is  given.  Using  finite 
differences  to  approximate  u"  we  obtain 


-ujt+i  +2ufc  -  Ufc_ 


-  +  PfcUfc  =  AyUfc,  k  =  l,...,n         uo  =  tin+i  =  0,  (1.3) 


where  A'  is  an  eigenvalue  in  the  set  {A,'}^.  Thus  we  have  an  additive  inverse  eigenvalue 
problem.  (1.2)  with 


^0  =  ^ 


f      2-1 
-  1      2-1 
-  1      2-1 
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and  D  =  diag[pk).  Hald  (1972)  is  a  comprehensive  reference  for  both  the  continuous  and 
discrete  inverse  Sturm-Liouville  problem. 


Another  interesting  inverse  eigenvalue  problem  is  obtained  by  studying  a  vibrating 
string.  Here  the  boundary  value  problem  is 

-u"(x)  =  Xp{x)u{x) 

(1-5) 
u(0)  =  u(7r)  =  0. 

The  question  is  whether  we  can  determ.ine  the  density  function  p(x)  >  0  from  the 
eigenvalues  {A*}^.  Discretizing  the  problem  as  before  we  obtain 

Au  =  \^Du         z  =  l,...,n 
or  equivalent  ly 

D~^Au  =  \iU         :  =  l,...,n,  (1.6) 

where  D  =  diag(^p{kh))  >  0  and  A  is  given  by  the  right  hand  side  of  (1.4).  This  kind 
of  problem  is  called  the  multiplicative  inverse  eigenvalue  problem:  given  a  real  symmetric 
matrix  A  and  eigenvalues  {A*}J,  find  a  positive  diagonal  matrix  V  such  that  VA  has 
the  given  eigenvalues.  We  can  write  this  problem  in  the  form  (l.l)  where  ^o  =  0,-4^  = 
efcfl^,  A;  =  1, . . . ,  n,  and  where  a  J  is  the  k^^  row  of  A.  The  matrices  Ak  are  not  symmetric 
in  this  case.  Note,  however,  that  a  diagonal  similarity  transformation  applied  to  VA  gives 
the  symmetric  matrix  V2AV2. 

Both  the  additive  and  multiplicative  inverse  eigenvalue  problems  were  posed  by  Down- 
ing and  Householder  (1956).  In  practical  applications  of  the  inverse  Sturm-Liouville  and 
inverse  vibrating  string  problems,  only  a  few  of  the  smallest  eigenvalues  may  be  given. 
In  order  for  the  problem  to  be  well-posed,  the  number  of  parameters  must  be  reduced 
accordingly.  This  can  be  done  by  expressing  the  potential  or  density  function  as  a  linear 
combination  of  a  few  given  basis  functions.  See  Osborne  (1971)  and  Hald  (1972)  for  details. 

Problem  1  also  arises  in  nuclear  spectroscopy  (see  Brussard  and  Glaudemans  (1977)). 
There  A{c)  is  the  Hamiltonian  ajid  the  set  {A*}  is  obtained  from  experimental  measure- 
ments. A  similar  problem  occurs  in  molecular  spectroscopy  (see  Pliva  and  Toman  (1966) 
and  Friedland  (1979)).  Practical  formulation  of  these  problems  often  involves  a  number  of 


parameters  which  is  smaller  than  the  number  of  given  eigenvalues.  It  is  therefore  appro- 
priate to  consider  a  least  squares  formulation: 

m 

t=i 

The  methods  that  we  shall  discuss  for  solving  (l.l)  can  be  generalized  to  handle  (1.7)  by 
using  well  known  techniques  (see,  for  example,  Dennis  and  Schnabel  (1983),  chapter  10). 

The  problem  of  communality,  which  arises  in  factor  analysis  (see  Harman  (1967),  chapter 
5),  is  as  follows.  Let  Ao  be  a  given  real  symmetric  matrix  with  zero  diagonal  entries.  The 
objective  is  to  find  a  diagonal  matrix  D  such  that  Aq  +  D  has  minimal  rank.  In  other 
words,  the  goal  is  to  find  D  such  that  Aq  +  D  has  as  many  eigenvalues  equal  to  zero 
as  possible.  This  problem  is  different  from.  Problem  1,  since  neither  the  rajik  nor  the 
nonzero  eigenvalues  are  known.  However,  we  can  guess  the  rajik  ajid  hence  the  number  of 
eigenvalues  which  are  equal  to  zero.  In  some  cases  this  is  enough  to  locally  determine  a 
solution,  as  we  shall  explain  in  §3. 

In  the  educational  testing  problem  (see  Fletcher(l985)),  we  are  given  asymmetric  positive 
definite  matrix  Aq  aind  wajit  to  know  how  much  can  be  subtracted  from  the  diagonal  of 
Aq,  with  the  restriction  that  the  resulting  matrix  is  positive  semi-definite.  The  problem 
may  be  posed  as  follows: 

n 

mcLX  7     Ck 

(1.8) 
subject  to         A,(yio  —  D)  >  0         i  =  I, . . .  ,n 

D  =  diag[ck)  >  0. 

In  this  problem,  as  in  the  problem  of  communality,  we  can  usually  expect  a  multiple 
zero  eigenvalue  at  the  solution.  Fletcher  (1985)  also  describes  a  problem  that  has  the 
same  structure  as  (1.8),  which  he  calls  the  matrix  modification  problem.  We  are  given  a 
syrometric  indefinite  matrix  Aq  and  want  to  add  as  little  as  possible  to  the  diagonal  of  Aq 
to  obtain  a  positive  semi-definite  matrix. 


An  important  class  of  problems  frequently  occurring  in  engineering  applications  has  the 
form 

min  /(c) 

^^^^  (1.9) 

subject  to     /  <  A,(c)  <  u     i  =  1, . . . ,  n., 

where  /(c)  is  a  real-valued  objective  function  and  /  amd  u  are  specified  lower  and  upper 
bounds  on  the  eigenvalues  of  the  matrix  A{c)  given  by  (l.l).  If  aji  optimization  method 
based  on  active  sets  of  inequality  constraints  is  used,  i.e.,  where  the  inequalities  thought 
to  be  binding  are  replaced  by  equality  constraints,  one  has  a  problem  closely  related  to 
Problem  1.  (The  same  remark  applies  to  the  educational  testing  problem,  which  has 
constraints  of  the  form  A,(i)  >  0.)  It  is  interesting  to  note  that  multiple  eigenvalues 
will  naturally  tend  to  occur  at  a  solution,  since  the  minimization  objective  may  drive 
several  eigenvalues  to  the  same  bound.  It  is  therefore  very  important  to  handle  multiple 
eigenvalues  correctly.  We  will  explain  how  to  do  this  in  §3,  in  the  context  of  Problem  1. 
A  closely  related  problem  hais  the  simple  form 

minu 

(1.10) 
subject  to     Ai(c)  <  u     :  =  l,...,n, 

i.e.  the  object  is  to  minimize  An(c),  the  maximum  eigenvalue  of  A{c).  Again,  multiple 
eigenvalues  generally  occur  at  the  solution.  There  is  a  large  engineering  literature  on 
problems  of  the  form  (1.9)  and  (1.10).  See,  for  example,  Polak  and  Wardi  (1982)  and  Mayne 
and  Polak (1982)  for  a  discussion  of  problems  from  control  theory  involving  restrictions  on 
singular  values  of  a  transfer  matrix,  and  Olhoff  and  Taylor  (1983)  for  a  discussion  of 
problems  from  structural  analysis  and  optimal  design. 

Another  interesting  variation  is  the  graph  partitioning  problem,  given  by  CuUum.,  Donath 
and  Wolfe  (1975).  The  objective  is  to  minimize  the  sum  of  the  largest  eigenvalues  of 
a  symmetric  matrix,  as  a  function  of  its  diagonal  entries.  More  precisely,  consider  the 
problem 
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min  tpiD)  =    ^    \i{Ao  +  D) 

t  =  n  — r 

subject  to     trace{D]  =  0, 

where  the  symmetric  matrix  Aq  and  the  integer  r  are  given,  and  D  is  diagonal.    This 
problem  can  be  transformed  into 


min    }      Ui 


mm 

t=n  — r 


subject  to     trace{D)  =  0 
Xi{Ao  +  D)  <  Ui         n  —  r  <  i  <  n. 

One  therefore  has  a  problem  closely  related  to  (1.9). 

There  axe  several  inverse  eigenvalue  problenas  with  special  structure  that  can  be  solved 
by  direct  methods.  An  example  of  this  is  the  reconstruction  of  Jacobi  matrices  from 
spectral  data;  see  de  Boor  and  Golub(1978).  We  will  not  consider  these  types  of  problems. 

§1.2  Notation  and  Definitions. 

We  define  A(c)  =  [Ai(c), . . . ,  An(c)]^  and  A(c)  =  diag{Xi{c)).  A  solution  to  Problem  1 
will  be  denoted  by  c*,  and  we  write  A*  =  [Ai,...,A*]^  and  A*  =  diag{X^).  Since  A{c) 
is  symmetric,  it  has  an  orthonormal  set  of  eigenvectors  {gt(c)}"-  The  orthogonal  matrix 
Q{c)  =  [qi{c), . . .  ,qn{c)]  will  be  called  a  matrix  of  eigenvectors  of  A{c).  Throughout  the 
paper  ||  •  j|  denotes  the  Euclidean  vector  norm  (£2  —  norm)  or  its  corresponding  induced 
matrix  norm,  and  ||  •  ||f  the  Frobenius  matrix  norm. 

§2.    DISTINCT  EIGENVALUES 

We  will  now  describe  several  methods  for  solving  Problem  1  in  the  case  where  the  given 
eigenvalues  are  distinct.  In  §3  we  will  see  how  to  cope  with  multiple  eigenvalues.  Assume 
there  exists  a  solution  c'  to  Problem  1.  Then  there  is  a  neighborhood  of  c"  where  the 
eigenvalues  A,(c)  are  distinct  and  are  differentiable  functions  (see  for  example  Ortega 
(1972),  p. 54).  In  this  neighborhood  we  will  consider  the  nonlinear  system  of  equations 
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/w  = 


Ai(c)-AJ 


.An(c)-A;. 


(2.1) 


The  first  method  we  will  describe  consists  of  applying  Newton's  method  to  (2.1).   Dif- 
ferentiating the  relations 


one  finds  that 


Thus  the  Jacobian  of  /  is 


?t(c)^g,(c)  =  1 
qi{c)'^  A{c)qi{c)  =  A,(c), 


5A,(c)  ,  .J. 


dci 


=  ?i(c)    Akqi{c). 


(2.2) 
(2.3) 


(2.4) 


Jik{c)  =  qi{c)'^Akqi{c), 
and  one  step  of  Newton's  method  is  defined  by 


(2.5) 


j{c^){c'+^-cn  =  -f{cn- 


We  will  write  (2.6)  in  a  slightly  different  form.  From  (l.l)  and  (2.3)  we  have 


qi{c)'^  Aoqi{c)  +  ^qi{c)'^  Akqi{c)ck  =  A,(c), 


k=i 


and  therefore  (2.6)  can  be  written  as 


(2.6) 


or  equivalently, 


qi{cn^A{c^+')qi{cn  =  X:, 


J{c'')c^+^  =  X'  -b{c^) 


(2.7) 


>.8) 


where 


bi{c]  =  q,{c)^  Aoqi{c]  z  =  l,...,n.  (2.9) 

Thus  Newton's  method  for  solving  (2.1)  is: 

Method  I. 

Choose  a  starting  value  c°.  Form  A(c°)  and  find  its  eigenvalues  and  eigenvectors. 
Fori/ =  0,1,2,...  ,^, 

1.  Stop  if  ||A(c'')  -  A*  II  is  sufficiently  small. 

2.  Form  7(0")  (see  (2.5))  and  6(0")  (see  (2.9))  and  compute  c'"^^  by  solving  (2. 8). Form 
A(c^+i). 

3.  Find  the  eigenvalues  {A,(c''+^)}  and  eigenvectors  {?,(c'"^^)}  of  A(c'^+^). 

Method  I  has  been  studied  by  many  autnors.  An  early  reference  is  Downing  ajid  House- 
holder (1956),  where  the  method  is  proposed  for  solving  the  additive  inverse  ajid  multi- 
plicative inverse  eigenvalue  problems.  Physicists  have  used  it  for  many  years  in  nuclear 
spectroscopy  calculations  (see  Brussard  and  Glaudemans  (1977)).  Kublanovskaja  (1970) 
has  given  a  convergence  ajialysis  of  this  method. 

Describing  the  iteration  by  means  of  (2.6)  seems  more  natural  than  using  (2.8).  However, 
the  latter  has  the  sajne  form  as  the  next  two  methods  we  will  present  below.  Also  (2.8) 
shows  that  the  direction  produced  by  Newton's  method  does  not  depend  explicitly  on  the 
eigenvalues  A(c''). 

Instead  of  computing  the  eigenvectors  of  A{c)  at  each  step  we  may  consider  approxi- 
mating them.  One  possibility  is  to  use  inverse  iteration.  Suppose  that  c^  is  our  current 
estimate  of  the  parameters  and  Q^^^  is  an  approximation  to  Q{c'^),  the  matrix  of  eigen- 
vectors of  ^(c").  Let  q';'  be  the  t*''  column  of  Q^").  To  compute  a  new  estimate  c""^^  we 
form 

Jik^  =  {qif^ic<l^  i,k  =  l,...,n  (2.10) 

6r  =  (g.n^'^ogr  i  =  l,...,n  (2.11) 
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and  solve 

j('')c''+i  =  A- -6".  (2.12) 

(Compare  with  (2.5),  (2.9)  and  (2.8)).  To  update  our  approximations  to  the  eigenvectors 
we  apply  one  step  of  inverse  iteration:  we  compute  7,-, :  =  1, . . . ,  n  by  solving 

[^(c''+^)-A;Jl7.  =  ?r         i=l,...,n.  (2.13) 

We  then  define 

'•     =w      '  =  ''■■■'''' 

which  determine  the  new  matrix  Q^^^^' .  Thus  we  are  performing  a  Newton-like  iteration 
where  instead  of  computing  the  exact  eigenvectors  of  A{c)  at  each  step  we  update  an 
approximation  to  them  by  performing  one  step  of  inverse  iteration. 

Method  II. 

Choose  a  starting  value  c°.  Form  A{c°)  and  compute  its  matrix  of  eigenvectors  Q(c°). 
Set  g(o)  ^  Q{c°). 
For  1/ =  0,1,2, ... 

1.  ff  ||(g(''))^yi(c'')(5('')  -  A* Hi.  is  sufficiently  small,  stop. 

2.  Form  J^"^  (see  (2.10))  and  6"  (see  (2.11))  and  compute  c''+^  by  solving  (2.12).  Form 

3.  Compute  the  factorization 

where  U  is  orthogonal  and  N  is  tridiagonal.  Solve  the  n  linear  systems 

and  compute 
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Method  II  is  closely  related  to  a  method  proposed  by  Osborne  (1971);  see  also  Hald  (1972). 
We  have  used  a  different  right-hand  side  vector  in  (2.12)  to  take  advantage  of  the  fact  that 
A{c)  is  an  affine  function.  As  we  will  discuss  below,  the  form  of  the  method  proposed  by 
Osborne  can  be  useful  when  A{c)  is  a  nonlinear  function. 

A  different  approach  is  based  on  the  use  of  matrix  exponentials  and  Cay  ley  transforms. 
A  solution  to  Problem  1  can  be  described  by  c  and  Q,  where  Q  is  an  orthogonal  matrix 
and 

Q^A{c)Q  =  A*.  (2.14) 

Suppose  that  Q^"'  is  our  current  estimate  of  Q.  Let  us  write  Q  =  Q^^'e^  where  Y  is  a 
skew-symmetric  matrix,  i.e.,  Y^  =  —Y.  Then  (2.14)  can  be  written  as 

(Q(''))^A(c)g('')  =  e^A-e-^ 

=  {I  +  Y+^Y^+  . .  .)A*(J  -Y  +  ^Y^+  ...). 
Thus 

(Q(''^)^A(c)g('')  =  A-  +  Yx'  -  A-y  +  o(|iy  11^).  (2.15) 

We  now  define  a  new  estimate  of  the  parameters  c"'''^  by  neglecting  second-order  terms  in 
Y  and  equating  the  diagonal  elements  of  (2.15).  We  obtain 

{qrfA{c^^')gr  =  X:         :  =  l....,n,  (2.16) 

which  is  identical  to  (2.12)  with  J^"^  and  6^'')  defined  by  (2.10)  and  (2.11).  Thus  the 
new  estimate  of  c'^'^^  is  obtained  in  the  same  way  as  in  Methods  I  and  II.  Equating  the 
off-diagonal  elements  of  (2.15),  with  second  order  terms  neglected,  we  have 

y..(A;  -  a;)  =  {q!^fA{c-^^')q^  l<i<j<n.  (2.17) 
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The  matrix  Y  is  completely  determined  by  (2.17),  since  Y  =  —Y'^  and  we  are  assuming 
tha;t  {A*}  are  distinct.  Now  construct  an  orthogonal  matrix  P  using  the  Cayley  transform 

P  =  [I  ^  \Y){I  -  \Y)-^  (2.18) 

ajid  compute  the  new  estimate  of  the  matrix  of  eigenvectors  by 

gC^'+i)  =  Q{^)p  (2.19) 

As  we  neglected  second  order  terms,  Q^""^^)  is  only  an  approximation  to  the  desired  matrix 
and  we  need  to  iterate. 

Method  in. 

Choose  a  starting  value  c°.  Form  >l(c°)  and  compute  its  matrix  of  eigenvectors  Q(c°). 
Set  g(°)  ^  Q(cO). 
Fori/ =  0,1,2,... 

1.  If  ||(g(''))^A(c'')g('')  -  A*|lir  is  sufficiently  small,  stop. 

2.  Form  J^")  (see  (2.10))  and  6"  (see  (2.11))  and  compute  0"+^  by  solving  (2.12).  Form 
A(c''+i). 

3.  For  z  =  1, . . . ,  n  and  j  =  i  +  1, . . . ,  n,  compute 

_  (9rr^4(c-+M< 

^''  ~       a;  -  A*      • 

4.  (Note  that  (2.18)  and  (2.19)  imply 

Compute  H  ^{I  -  \Y){Q^^^)'^).  Let  h,  be  the  :*^  column  of  H. 
Factorize  the  matrix  /  -i-  |y,  and  use  this  to  solve  the  n  linear  systems 

[I+\Y)vi  =  hi         :-  =  l,...,n. 

Set 
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This  method  is  apparently  new.  Downing  and  Householder(l956)  use  the  Cayley  transform 
to  motivate  the  Newton  step  (2.6)  in  Method  I.  However,  they  do  not  suggest  updating 
approximations  to  the  eigenvectors,  but  instead  compute  the  exact  eigenvectors  of  A{c)  at 
each  step. 

One  can  motivate  Method  III  following  a  different  reasoning.  Suppose  that  we  are  given 
am  initial  matrix  5^°^  whose  eigenvalues  coincide  with  the  target  eigenvalues  {A"}.  IS  5^°' 
can  be  written  in  the  form  (1.1),  the  problem  is  solved.  Otherwise  we  generate  a  sequence 
{B^^^},i/  —  1,2,...,  which  converges  to  a  matrix  of  the  form  (l.l),  and  with  the  property 
that  each  matrix  in  the  sequence  has  a  spectrum  that  coincides  with  the  target  spectrum. 
Given  B^^\  we  would  like  to  find  a  skew-symmetric  matrix  Z  and  a  vector  c''"^^  such  that 

Expanding  e^  ajid  neglecting  second  order  terms  we  obtain 

The  diagonal  equations  determine  c^""^^',  as  before,  and  the  off-diagonal  equations  deter- 
mine Z.  We  now  let  R  =  {I  +  \Z)[I  -  \Z)-^  and  define  £(''+1)  =  R^B^'^^R.  To  find 
S^°)  we  may  proceed  as  follows.  Let  c°  be  our  first  estimate  of  the  parameters.  Compute 
the  eigenvectors  {gt(co)}  of  ■A(c°)  and  define 


B^'^=j:X'Mc^)g,{cy 


t=l 
Then  B^°^  has  the  target  spectrum  and  its  eigenvectors  coincide  with  those  of  A{c°).  It  is 
not  difficult  to  see  that  this  process  is  identical  to  Method  HI. 

Let  us  now  look  at  a  different  formulation  of  Problem  1.  Consider  the  nonlinear  system 


9{c)  = 


det{A{c)  -XII) 


_det{A{c)  -  A;J) 
Note  that  the  i*''  equation  of  (2.20)  can  be  written  as 
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=  0.  (2.20) 


gi{c)  =  Y[{\k{c)-x:). 


(2.21) 


fc=l 


To  apply  Newton's  method  to  this  new  system  we  first  need  to  compute  the  Jacobian. 
From  (2.4)  and  (2.21)  it  follows  that 


dcj 


Therefore  the  Jacobian  of  g  is 


=  J2['j^A:g>^I[Mc)-K)Y 


k=l 


(2.22) 


G{c) 


n(A^(c)-At)    ...     Y[{Mc)-K) 


n  (A£(c)  -  a; 


qi{c)'^  Aiqi{c)       ...      (?!  (c)^An9i(c) 


.g„(c)^Aig„(c)      ...     ?„(c)^yln9n(c). 


=  diag{gi{c)) 


Ai(c)-AJ       ••■  A„(<:)-A- 

1  1 

Ai(c)-A;      •••  An(c)-A; 

1 


J{c) 


=  diag{gi[c))  ■  diagi-—-  j  ■  V [c] 
where  J{c)  is  given  by  (2.5)  and  V{c)  is  defined  by 

^    ,,        ^^   [qk{c)''A,q,{c)][X.{c)-\l] 


k=l 


The  Newton  iterate  is  therefore 


(2.23) 


(2.24) 


^"+1  =  c-  -  V{cn-'diag{Mcn)diag  (^^-^  )  gic^ 
=  c''-V{cn-'f{cn- 
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Method  IV. 

Choose  a  starting  value  c°.  Form  j4(c°)  and  compute  its  eigenvalues  and  eigenvectors. 
For  u  =  0,1,- 

1.  Stop  if  ||A(c'')  —  A* II  is  sufficiently  small. 

2.  Form  V[c'')  (see  (2.24))  and  compute  0"+^  by  solving 

Form  A(c''+i). 

3.  Find  the  eigenvalues  {A,(c''+^)}^  and  eigenvectors  {g.(c''+^)}^  of  A(c''+^). 

This  method  was  proposed  by  Biegler-Konig  (1981)  and  generalizes  an  algorithm  of  Lan- 
caster (1964-a).  To  show  its  relation  to  Method  I  we  note  that  V[c)  can  be  written  as 


V{c)  =  W[c)-J[c), 


where  the  matrix  W  is  defined  by 


W{c)  = 


A:(<:)-At 
A2(c)  -XI 

An(c)-A;    a.(c)-a; 


Ai(c)-At 
An(c)  -A* 


(2.25) 


LAi(c)-A;     A2(c)-A;     ••• 

Thus  Method  IV  differs  from  Method  I  in  that  J{c'^)  has  been  replaced  by  W{c'^)J{c'^). 
Since  the  given  eigenvalues  {A*}"  are  distinct,  W{c)  — +  /  as  c  -^  c',  and  so  asymptotically 
Methods  I  and  IV  coincide.  Nevertheless,  our  numerical  experience  indicates  that  Method  I 
almost  always  requires  fewer  iterations,  and  that  Method  IV  suffers  more  often  from  ill- 
conditioning.  One  can  readily  see  the  drawback  of  using  formulation  (2.20)  by  noting 
that 


9^{c)  =  Mc)Yl{\k{c)-X:). 


(2.26) 


One  is  thus  complicating  system  (2.1)  by  multiplying  each  equation  by  a  polynomial  of 
degree  n  —  1.    Suppose,  for  example,  that  the  problem  is  so  simple  that  the  functions 
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{A,(c)}"  are  linear.  Then  (2.1)  is  a  linear  system  of  equations  and  Method  I  will  find 
the  solution  in  one  step.  On  the  other  hand,  (2.20)  represents  a  system  of  polynomial 
equations  of  degree  n,  amd  Method  IV  will  have  to  iterate  to  approach  the  solution.  It 
therefore  seems  that  Method  I  is  always  to  be  preferred  over  Method  TV. 

Let  us  now  compare  the  computational  requirements  of  Methods  1,11  and  III.  We  will 
first  assume  that  A{c)  is  dense.  The  computation  of  the  eigenvalues  and  eigenvectors  in 
Method  I  requires  approximately  5n^  multiplication  operations;  see  Golub  and  Van  Loan 
(1983).  Method  11  requires  approximately  2n^  multiplications  to  update  the  q^  ,  whereas 
Method  m  requires  approximately  4n."'  multiplications  in  Steps  3  ajid  4.  Note  that  all  the 
methods  require  n'^  multiplications  to  form  the  matrix  J  in  Step  2.  However,  in  the  case 
of  the  additive  inverse  problem  (1.2),  forming  J  requires  only  n"^  operations.  If  the  matrix 
A{c)  is  sparse,  Method  HI  becomes  less  competitive.  For  example,  if  A{c)  is  tridiagonal. 
Method  I  requires  only  about  14n-^  multiplications  per  iteration  (9n^  to  compute  the 
eigenvalues  (Parlett  (1980,  p. 165))  plus  5n.^  to  find  the  eigenvectors  by  inverse  iteration), 
while  Method  II  requires  on"^  multiplications  in  Step  3  and  Method  III  requires  about  3n"^ 
multiplications  in  Steps  3  and  4.  In  §4  we  will  comment  on  the  numerical  behavior  of  the 
three  methods. 

Methods  I,  H,  HI  ajid  FV  are  locally  quadratically  convergent  under  some  nonsingularity 
assumptions.  This  will  be  shown  in  §3.3.  There  are,  on  the  other  hand,  various  methods  for 
solving  Problem  1  that  are  only  linearly  convergent.  One  of  these  methods  was  proposed 
by  Downing  and  Householder  (1956).  The  iteration  is 

c^+i  =  0"-  (A(c)-A*), 

ajid  thus  is  obtained  from  Method  I  by  replacing  J{c^)  by  the  identity  matrix.  A  different 
method,  specifically  designed  for  solving  the  additive  inverse  eigenvalue  problem  (1.2),  was 
proposed  by  Hald  (1972).  Suppose  that  ZP^"^  is  our  current  estimate  of  the  desired  diagonal 
matrix,  ajid  that  Q^'^^  is  the  matrix  of  eigenvectors  of  Aq  +  D^^K  We  define  Z)('^+-)  as  the 
diagonal  matrix  that  solves  the  problem 
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min||(Ao  +  D)g('')-Q(''U*|ljr. 


Since 


\\{Ao  +  D)Q^^^  -  Q^''^\'\\f  =  \\D  -  (Q('')A-(g(''))^  -  Ao)\\f 
it  is  clear  that 

i^^^+i)  =  diag{{Q^''^y{Q^''Y  -  Ao)^^).  (2.27) 

Thus  in  this  method  one  computes  the  eigenvectors  at  each  step  and  updates  the  estimate  of 
D  by  means  of  (2.27).  Friedland  (1977)  generalized  this  method  for  more  general  functions 
A{c). 

A  method  closely  related  to  (2.27)  can  be  used  for  solving  the  problem  of  communality. 
Recall  that  the  problem  is:  given  -4o,  find  a  diagonal  matrix  D  such  that  Aq  +  D  has  as 
small  rank  as  possible.  We  start  by  maJcing  a  guess  of  the  rank:  say  that  it  is  n  —  t.  Thus 
t  eigenvalues  will  be  zero  at  the  solution.  Suppose  that  D^"^  is  our  current  estimate,  and 
let  Q^"^  and  A^")  be  the  matrices  of  eigenvectors  and  eigenvalues  of  Aq  -r  D^'^'.  We  define 

where  A  is  obtained  from  A^"^  by  setting  the  t  smallest  diagonal  elements  (in  absolute 
value)  to  zero.  Ideally  we  would  like  to  use  A'  instead  of  A  ,  but  only  t  zero  elements 
of  A"  are  known.  This  method  is  described  in  Holzinger  and  Harman  (1941),  and  our 
numerical  experience  indicates  that  it  is  robust  but  very  slowly  convergent. 

We  will  now  discuss  what  chajiges  are  needed  in  the  numerical  methods  described  so 
far,  when  A{c)  is  not  affine,  but  is  a  nonlinear  function  of  c.  Note  that  (2.4)  should  be 
replaced  by 

^M£)=^,(,)T^,(,)<^..(e)  (2.28) 
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where 

^fc(^)  =  5— ^(^)-  (2-29) 

Method  I  is  then  defined  by  (2.6)  where  /(c")  is  formed  by  using  (2.28).  Note  that  (2.8) 
cannot  be  used  since  it  was  derived  under  the  assumption  that  A[c)  is  affine.  Similarly, 
for  Method  IV  we  need  only  replace  Aj  by  A_,  (c)  in  (2.24). 

For  Method  II  we  do  not  wish  to  compute  the  eigenvalues  A,(c)  required  for  (2.6).  A 
natural  modification  to  Method  II  is  to  consider  an  iteration  of  the  form  (2.6),  where 
the  vectors  qi  are  updated  by  using  the  inverse  iteration  (2.13),  ajid  the  eigenvalues  A,(c) 
are  approximated  by  means  of  the  Rayleigh  quotient,  i.e.,  A,(c)  =  qj A[c)qi.  A  differ- 
ent approach  was  suggested  by  Osborne  (1971)  and  does  not  require  approximating  the 
eigenvalues  Ai(c)  explicitly.  He  defines  the  function  /?(c)  =  [/3i(c), . . .  ,/3,i(c)]^  by 

/?t(c)  =  (^f  ?i)"'         t  =  l,...,n, 
where  {^i}"  axe  our  current  approximations  to  the  eigenvectors  and  the  7,-  are  given  by 

[A{c)  -\'-I\-ii  =  qi         t  =  l,...,n. 

If  we  apply  one  step  of  Newton's  method  to  the  system  /3(c)  =0  we  obtain  an  iteration  of 
the  form 

J(0(c''+i-c'')  =  -/3(0, 
where 

and  Akic)  is  defined  by  (2.29).  It  is  not  difficult  to  see  that  0{c)  approximates  A(c)  -  A'; 
see  Hald  (1972)  for  details. 

Finally,  consider  Method  III.  Equation  (2.16)  is  now  a  nonlinear  equation  in  c.  We  can, 
however,  replace  ^(0""*"^)  in  (2.16)  by  the  first  order  approximation 

A(c^)  +  A,  {cnicr'  -  cr)  +  •  •  •  +  AninK-^'  -  <) 
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to  obtain 


J(c'')(c''+'  -c")  =  -(A''-A'), 


where 


and  A"  =  [A5f , . . . ,  Aj;]^  is  defined  by 

Ar  =  kn^^lc")?.", 

the  Rayleigh  quotient.  After  c^^^  has  been  computed  we  update  the  vectors  qi  by  (2.17)- 
(2.19). 

§3.    MULTIPLE  EIGENVALUES 

In  this  section  we  suppose  that  {A,'}  includes  multiple  eigenvalues,  and  that  a  solution 
c'  to  Problem  1  exists.  We  first  describe  how  the  methods  of  §2,  without  modifications, 
behave  in  this  case,  ajid  then  explain  how  the  problem  formulation  and  methods  should 
be  changed  when  multiple  eigenvalues  axe  present.  For  convenience  we  assume  that  only 
the  first  eigenvalue  is  multiple,  with  multiplicity  t,  i.e., 

Aj  =  A2  =  •  •  •  =  Aj  <  X^j^i  <  •  •  •  <  A^. 
There  is  no  difficulty  in  generalizing  all  our  remarks  to  an  arbitrary  set  of  given  eigenvalues. 

§3.1  Behavior  of  Unmodified  Methods. 

Let  us  first  consider  Method  I.  When  the  given  eigenvalues  are  distinct  it  is  straightfor- 
ward to  see  that  Method  I  is  locally  quadratically  convergent,  under  the  assumption  that 
the  Jacobian  (2.5)  is  nonsingular  at  c' .  The  reason  is  that  Method  I  consists  of  applying 
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Newton's  method  for  finding  a  zero  of  /,  defined  by  (2.1),  which  is  a  smooth  function.  In 
fact,  the  first  partial  derivatives  of  A,(c)  are  given  by  (2.4),  and  it  is  not  difficult  to  show 
that 


dckdcj 


q^{c)^Akqi{c)][q,{c)^A,q,{c)\ 
A»(c)-A^(c) 


(3.1) 


(see  for  example  Lancaster  (1964-b)).  Thus  /  satisfies  the  well-known  conditions  for 
quadratic  convergence  of  Newton's  method:  (i)  /  is  differentiable  and  J{c)  is  Lipschitz 
continuous  in  a  neighborhood  of  c*;  (ii)  J{c*)  is  nonsingular.  (See  Ortega  and  Rheinboldt 
(1970),  p.  312). 

However,  we  caji  see  from  (3.1)  that  as  the  separation  of  the  eigenvalues  decreases,  the 
Lipschitz  constant  generally  grows,  the  problem  becomes  increasingly  ill-conditioned,  and 
the  neighborhood  of  the  solution  in  which  convergence  takes  place  becomes  smaller.  When 
the  sepajation  is  zero,  i.e.,  when  multiple  eigenvalues  are  present,  the  eigenvalues  are  not, 
in  general,  differentiable  at  c*.  Furthermore,  the  eigenvectors  {qi{c*)}  are  not  unique,  and 
they  cannot  generally  be  defined  to  be  continuous  functions  of  c  at  c*.  This  can  be  seen 
by  considering  the  example  Aq  =  0, 


A,  = 


1  0 

0      -1 


A2   = 


0  1 

1  0 


c*  =  A-  = 


Note  that  as  long  as  the  eigenvalues  of  A{c^)  are  distinct  for  all  iterates  c",  Method  I 
remains  well-defined.  However,  the  matrix  of  eigenvectors  Q{c'''),  and  consequently  the 
Jacobian  J{c^),  generally  will  not  converge  as  c"  — ►  cV  Therefore  one  might  expect  that 
in  the  multiple  eigenvalue  case  the  method  is  at  best  slowly  convergent  to  c'.  In  fact, 
however,  the  convergence  is  generally  quadratic,  both  in  theory  and  in  practice.  This  fact 
was  apparently  first  established  by  Nocedal  and  Overton  (1983),  although  Bohte  (1967- 
68)  had  observed  earlier  that  the  method  experienced  no  difficulties  in  his  numerical  tests 
with  multiple  eigenvalues.  The  quadratic  convergence  may  be  explained  in  several  ways. 
Nocedal  and  Overton  (1983)  base  their  analysis  on  a  classical  result  of  Rellich  (1969),  which 
states  that  the  eigenvalues  can  be  defined  to  be  analytic  functions  of  a  single  variable,  along 
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any  line  passing  through  the  solution  c' .  By  using  the  meaji  value  theorem  in  one  variable  it 
follows  that,  locally,  every  Newton  step  produces  a  quadratic  contraction  in  the  error.  The 
result  is  that,  given  a  nonsingularity  assumption,  the  iterates  {c"}  converge  quadratically 
although  the  sequence  {J{c^)}  does  not  converge.  A  completely  different  proof  of  this 
result  will  be  given  in  §3.3. 

We  comment  further  here  on  the  nonsingularity  assumption  needed  for  quadratic  con- 
vergence. It  is  sufficient  to  assume  that  {A{c'^)}  has  distinct  eigenvalues  for  all  u  and  that 
{II J(c'')~^||}  is  bounded.  Even  though  this  condition  usually  holds  in  practice,  it  would 
be  more  desirable  to  state  the  nonsingularity  assumption  in  terms  of  a  matrix  evaluated 
at  c*.  Since  the  matrix  of  eigenvectors  Q  is  not  uniquely  determined  at  c*,  let  us  define 

n  =  {Q:Q^Q  =  I    and    Q^A{c')Q  =  A'} 
and,  for  any  Q  E  Q,  define  J'[Q)  by 

J:,,[Q)  =  qj  Ai^qi. 
To  obtain  a  useful  nonsingularity  assumption  we  may  consider 

sup{||j*(g)-^||}. 

However,  it  turns  out  that,  in  general,  this  supremum  does  not  exist.  By  solving  a  system 
of  n  —  i  -h  1  linear  equations  in  n  unknowns  and  doing  an  appropriate  transformation,  it 
is  generally  possible  to  choose  the  eigenvectors  so  that  J*{Q)  is  singular.  To  see  this,  let 
Q  =  [?i,...,9n]  be  any  set  of  eigenvectors  oi  A{c').  Consider  the  system. 


jt=i  ^.=1 


2i^(^gfAfc9.)xjt  =  0 


^IqjAkqAxk^Q         :  =  i  +  l,...,n, 
;t=i  ^  ^ 
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where  the  unknown  x  =  [xi, . . .  ,Xn\'^  €  R''.  This  homogeneous  system  is  always  solvable 
for  some  x  j^  0,  assuming  t  >  1  .  Let  Qi  =  [qi, .  ■  ■  ,qt]-  By  construction, 

tr{Q^{A{x)  -Ao)Qi)  =0  =  qf{A{x)  -  Ao)qi,  t"  =  i -f  1, . . . ,  n. 

It  follows  that  QJ (^A{x)  —  Aq)Qi  is  orthogonally  similar  to  a  matrix  with  zero  diagonal 
(this  is  readily  proven  by  induction  on  n).  Let  U  define  this  similarity  transformation,  let 
Qi  =  QiU,  and  define  Q  =  [Qi,qt+i, . . .  ,qn]-  Then,  by  construction,  J*(Q)x  =  0  which 
implies  that  J*{Q)  is  singular. 

The  question  then  arises:  does  there  exist  a  direction  d  such  that  A{c*  +  ad)  has  distinct 
eigenvalues  for  small  a  >  0  and  such  that  the  condition 

Q{A{c*  +ad))  -*  Q    as    a -^  0+  (3.2) 

holds?  The  answer  is,  generically,  yes,  provided  n  >  t{t  —  l)/2.  To  obtain  such  a  vector  d, 
one  solves 

n 
J2{QlAkQi)dk  -  diag[iJii)  =  0, 
k=l 

a  system  of  t{t  +  l)/2  equations  \n  n  +  t  unknowns  d  and  {mx}i-  When  n  >  t{t  -  l)/2, 
there  is  generally  a  solution  with  {^i}\  distinct.  In  this  case  it  is  not  difficult  to  show  that 
(3.2)  holds. 

The  consequence  of  the  discussion  above  is  that  generally  there  is  a  manifold  M.  of 
dimension  less  thaji  n  for  which  J[c)  approaches  singularity  when  c  E  M.  and  c  ^^  c' .  By 
carrying  out  the  construction  described  above,  one  is  able  to  obtain  a  point  c  G  .M,  with  c 
near  c".  Interestingly  enough,  however,  even  when  Method  I  is  started  at  such  a  point,  it 
has  no  difficulty  with  convergence.  Typically  the  next  iterate  c^  is  not  in  or  near  M,  and 
subsequent  iterates  converge  quadratically  as  usual.  Because  H  has  dimension  less  than 
n,  it  is  very  unlikely  that  iterates  in  M  will  be  generated.  One  might  well  be  concerned  at 
the  possiblity  that  if  iterates  near  M.  are  generated,  convergence  could  be  slow.  However, 
one  must  remember  that  J[c)  is  not  continuous  at  c',  and  that,  for  example,  J[c)  varies 
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extremely  rapidly  on  small  spheres  centered  at  c'.  Thus  even  at  a  point  c  lying  near  M, 
and  near  c' ,  J{c)  may  be  far  from  singular. 

So  far  we  have  explained  the  behavior  of  Method  I  in  the  presence  of  multiple  eigenvalues 
without  making  any  modification  to  the  method.  In  §3.2  we  shall  see  that  when  the  problem 
is  properly  formulated,  the  method  should  be  modified.  In  that  case  J  is  replaced  by  a 
matrix  which  does  not  have  the  undesirable  property  that  its  nonsingularity  depends  on  the 
choice  of  the  eigenvectors.  The  interesting  conclusion  from  the  discussion  just  completed, 
however,  is  that  even  when  no  modifications  are  made  to  Method  I,  it  is  generally  locally 
quadratically  convergent  regardless  of  the  eigenvalue  multiplicities,  assuming  the  problem 
has  a  solution. 

Let  us  now  consider  Method  11.  Difficulties  may  arise  during  the  application  of  the  inverse 
iteration  (2.13)  due  to  the  presence  of  the  multiple  eigenvalue  AJ.  To  see  this,  let  us  write 


n 


and  thus  (2.13)  gives 

Now  suppose,  for  example,  that  A  J  is  much  closer  to  A  1(0")  than  it  is  to  A2(c''), . . . ,  At(c''). 
Then  all  the  vectors  {ni}\  will  be  nearly  parallel  to  qi  (c""*"^)  and  will  fail  to  approximate  the 
invariant  subspace.  To  avoid  this  difficulty  each  new  vector  7,,  1  <  i  <  t,  is  orthogonalized 
with  respect  to  the  earlier  computed  vectors  belonging  to  the  multiple  eigenvalue.  Thus 
we  solve 

for  r  G  R'^^"*,  where  q[''^  =  [g^, . . . ,  q^],  and  then  compute  the  "Qi?" factorization 


r^gi'^^'r, 
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where  T  is  upper  triangular.  If  the  orthogonalization  produces  a  zero  vector,  we  replace 
the  corresponding  column  of  Qi  by  a  unit  vector  ey,  trying  the  columns  of  the  identity 
matrix  in  turn  until  we  succeed  (see  Peters  and  Wilkinson  (1971)  p. 418).  The  vectors  cor- 
responding to  the  distinct  eigenvalues  are  updated  by  meajis  of  (2.13).  It  will  be  shown  in 
§3.3  that  Method  11,  using  this  implementation  of  the  inverse  iteration,  is  locally  quadrat- 
ically  convergent.  The  same  argument  given  for  Method  I  shows  that  {H-/^"^"^!!}  may  not 
be  bounded.  However,  as  in  Method  I,  this  is  very  unlikely  to  occur;  furthermore,  this 
consideration  will  disappear  when  we  modify  the  methods  in  §3.2. 

We  now  turn  our  attention  to  Method  HI.  The  diagonal  equations  (2.16)  define  c""*"^ 
just  as  Methods  I  and  11  do,  regardless  of  the  eigenvalue  multiplicities.  The  off-diagonal 
equations  (2.17)  are 

yt;(A;  -  A*)  =  [q'^f  A{c''^')q'^  1  <  z  <  j  <  n. 

The  left-hand  side  of  this  equation  is  zero  for  1  <  i  <  j  <  t  regardless  of  the  value  of  y,j, 
and  thus  yij  is  not  determined.  A  reasonable  course  to  take  is  to  set 

Vij  =0         1  <  :'  <  J  <  i. 

With  this  choice  it  will  be  shown  in  §3.3  that  the  iterates  {c^}  converge  quadratically  to  c' 
as  in  Methods  I  and  II.  Furthermore,  it  will  be  shown  that  {(5^"^'}  converges  quadratically 
to  a  limit.  It  follows  that  {J^"'}  converges.  The  remarks  made  for  Methods  I  and  II, 
concerning  the  possible  unboundedness  of  {i|'/^''^~^]|}  ,  also  apply  to  Method  III. 

Now  consider  Method  FV.  Here  the  analysis  is  trivial.  In  the  multiple  eigenvalue  cjLse, 
the  function  g{c)  remains  differentiable,  but  has  multiple  entries.  The  Jacobian  of  g  is  thus 
necessarily  singular.  It  is  clear  that  the  method  must  be  reformulated.  However,  since  we 
do  not  consider  this  method  computationally  attractive,  we  will  not  discuss  it  any  further. 
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§3.2  Modification  to  Formulation  and  Methods. 

Let  us  view  the  problem  from  a  slightly  different  perspective.  The  relation 

A{c)  =  QTQ^  (3.3) 

caji  be  considered  a  system  of  n{n  +  l)/2  equations  in  n{n  -r  l)/2  unknowns,  namely  the 
parameters  {cfc}^  and  the  orthogonal  matrix  Q  which  has  n{n  —  l)/2  degrees  of  freedom. 
However,  when  t  >  1,  5  =  i(i  —  l)/2  of  these  degrees  of  freedom  are  of  no  help  in 
solving  the  problem  (3.3),  since  they  describe  only  the  rotation  of  [gi,. . .  ,qt]  to  a  different 
choice  of  basis.  It  follows  that  when  A'  is  completely  specified  and  i  >  1,  Problem  1  is 
inherently  over  determined.  This  did  not  cause  difficulties  for  our  disciission  in  §3.1  because 
we  assumed  throughout  that  a  solution  c'  existed,  i.e.,  that  A'  was  choosen  so  that  (3.3) 
Wcis  solvable.  However,  in  practice  we  must  expect  that  if  a  multiple  eigenvalue  is  present, 
either  s  of  the  remaining  eigenvalues  are  not  specified,  or  an  additional  5  parameters  are 
available.  For  convenience  we  shall  maJte  the  former  assumption  and,  instead  of  Problem  1, 
consider 

Problem    2.    Find  the  parameters  ci,...,c„  so  that  the  n  —  s  smallest  eigenvalues  of 
A{c)  have  the  values  i 

Aj  =  ■  •  ■  —  X^  <  Af^i  <  •  •  •  <  A„_3, 

where  s  =  t{t  -  l)/2. 

It  is  clear  that  the  numerical  methods  of  §2  must  now  be  modified  since  5  of  the  rows  of 
J  have  effectively  been  removed.  Our  goal  is  to  obtain  methods  which  are  quadratically 
convergent  even  though  3  of  the  eigenvalues  are  not  specified.  Let  us  start  by  considering 
Method  HI,  since  in  this  case  it  is  rather  clear  what  should  be  done.  Consider  again  (2.15), 
with  second  order  terms  neglected, 

(Q(''')^A(c)Q('')  =  a-  +  Y\'  -  \'Y,  (3.4) 

which  appears  to  represent  n{n  -h  l)/2  equations  in  the  n{n  ~  l)/2  unknowns  c  and  Y. 
However  5  =  i(i  -  l)/2  of  the  y.y,  namely  those  for  which  1  <  i  <  j  <t,  are  of  no  help  in 
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solving  (3.4),  and  may  be  removed  from  the  equation,  since  they  are  multiplied  by  zero  on 
the  right-hand  side  of  (3.4).  Thus  we  see  again  that  it  is  appropriate  to  specify  only  n  —  s 
eigenvalues,  and  so  we  replace  A*  in  (3.4)  by  A  =  diag{Xt)  where  A,  =  A*,  i  =  1, . . .  ,n  —  s, 
and  where  the  last  5  entries  {A,}"_3^i  are  free  parameters.  Equation  (3.4)  then  has 
n{n  +  l)/2  unknowns,  namely  n  parameters  {ck},  n{n  —  l)/2  —  5  parameters  {utj},  and 
5  parameters  {At}^_3+i  To  solve  this  equation  we  separate  the  computations  of  c  and  Y, 
as  before.  There  are  n  equations  defining  c  alone,  namely 

E((<)''^^?n<^'  =  K  -  (^D^'^o^r  i  =  l,...,n~s  (3.5) 

Jfe=l 

and 

i2ii^if^>^^:K^'  =  -i'^n^'Aog^         l<i<J<t.  (3.6) 

k=l 

The  equations  (3.6)  were  not  previously  imposed  by  Method  III;  they  were  not  needed 
since  we  had  assumed  existence  of  a  solution  to  an  overdetermined  problem.  We  denote 
the  combined  system  (3.5),  (3.6)  by 

if  ("^c^^^  =  /i".  (3.7) 

Having  thus  defined  c^"*"^,  the  remaining  unknowns  {vij}  and  {A,}  are  defined  by 

A,  =  {q!^f  A{c''+')q^  n-s<i<n  (3.8) 

ya  =  -        -    ^  1  <  :  <  y  <  n,      j  >t.  (3.9) 

Ay  —  A, 

Finally,  we  set 

y,y  =  0         1  <  I  <  i  <  i  (3.10) 

since  these  parameters  describe  the  rotation  of  the  eigenvectors  corresponding  to  the  mul- 
tiple eigenvalue,  and  therefore  can  be  set  to  zero.  The  convergence  analysis  of  the  next 
section  will  show  that  zero  is,  in  fact,  the  best  choice  for  these  parameters. 
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One  difficulty  remains:  suppose  that  at  the  solution  one  of  the  eigenvalues  that  was  not 
specified  is  actually  multiple,  i.e.,  for  some  j  >  n  —  s,  A_,(c')  =  At(c*),  with  i  =  j  ±  1. 
Then  if  the  modified  method  is  applied  we  will  normally  have  that  |y,-y|  — ♦  oo  as  c"  — *  c', 
since  {Ay  —  A,}  will  generally  converge  to  zero.  This  can  be  avoided  by  introducing  a 
tolerance  parameter  and  setting  y,j  to  zero  if  \Xj  —  Xi\  drops  below  this  parameter. 

A  more  formal  description  of  the  modifed  Method  III  will  be  given  below.  Let  us  first  go 
back  to  Method  I  and  modify  it  so  that  it  solves  Problem  2.  To  compute  the  new  estimate 
of  the  parajneters,  the  n  —  s  equations  (3.5)  for  the  distinct  eigenvalues  are  combined  with 
the  s  equations  (3.6)  for  the  multiple  eigenvalue  Aj,  to  give  a  system  identical  to  (3.7), 
except  that  Q  refers  to  the  computed  eigenvectors  Q{c),  rather  than  the  approximations 
updated  by  Method  m.  With  hindsight  we  can  show  that  this  is  asymptotically  equivalent 
to  applying  Newton's  method  to  a  reformulation  of  (2.1).  Note  that  (3.5)  and  (3.6)  can 
be  written  as 

{q[^Ya{c^^')Q[''^  =  X[I^ 

(3.11) 

{q'^f  A{c''^')qlf  =  \-  i  =  t  +  l,...,n-s, 

where  Q^     =  [gjf,. . .  ,qt\.  Consider  the  Newton  iteration  on  the  nonlinear  system 

h{c)  =  Qi{cfA{c)Qi{c)-XlIt  =  0 

(3.12) 
(/2(c)),  =  q,{c)^  A{c)qi{c)  -  X-  =  0         i  =  t  +  I, . . .  ,n  -  s, 

where  Qi{c)  =  [qi{c),. . .  ,qt{c)]  .  Note  that  /i  represents  t{t  -r  l)/2  equations  and  /2 
consists  of  some  of  the  components  of  /  in  (2.1).  Difi"erentiating  /i  with  respect  to  c,  one 
obtains 

/i(c)  =  Qi{cfA{c)Q,{c)  +  Qi{cfA{c)Qi{c)  +  Q,{cf  A{c)Q,{c).  (3.13) 

Differentiating  Qi[c)'^Qi[c)  =  I  and  using  the  relation  A(c)  =  Q{c)\{c)Q{c)'^ ,  (3.13) 
simplifies  to  become 

/i(c)  =  QxicfA{c)Qi{c)  +  BXt[c)  -  \t{c)B,  (3.14) 
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where  B  =  (5i(c)^Qi(c)  and  At(c)  =  diag[\i[c), . . .  ,\t{c)).  Assuming  that  B  remains 
bounded,  the  last  two  terms  in  the  right-hand  side  of  (3.14)  cancel  in  the  limit  since 
At(c)  — »■  \\It-  Thus  from  (2.4)  and  (3.14)  we  see  that  (3.11)  is  essentially  a  Newton  step 
on  (3.12).  Although  a  true  Newton  step  would  include  the  last  two  terms  on  the  right-hand 
side  of  (3.14),  quadratic  convergence  is  not  impeded  by  dropping  them.  To  prove  quadratic 
convergence,  one  must  taJie  into  eiccount  the  lack  of  continuity  of  Q\[c),  and  this  can  be 
done  by  applying  Rellich's  theorem  as  discussed  in  §3.1.  However,  in  the  next  section  we 
will  present  a  more  direct  proof. 

Let  us  now  modify  Method  11.  Using  the  implementation  of  inverse  iteration  described 
in  §3.1  we  compute  approximations  {q'(,...,q'^_g)  to  the  eigenvectors  corresponding  to 
the  n  —  s  given  eigenvalues  {A*}"~^.  The  new  estimate  of  the  parameters  is  obtained,  as 
in  Methods  I  and  HI,  by  solving  (3.7),  where  the  gf  are  the  vectors  obtained  by  inverse 
iteration. 

Thus  the  Modified  Methods  I,  11  and  III  have  the  same  form,  with  differences  only  in 
the  way  of  computing  the  q" .  We  will  obtain  quadratic  convergence  for  the  three  methods 
if  we  assume  that  the  matrix  K  defined  by  (3.5)-(3.7),  with  {gf }  replaced  by  a  set  of 
eigenvectors  of  ^(c*),  is  nonsingular.  Note  that  this  condition  is  independent  of  the  choice 
of  the  basis  Qi(c*),  a  much  more  satisfactory  situation  than  in  §3.1. 

We  now  describe  in  detail  the  modified  versions  of  Methods  I,  II  and  III  designed  for 
solving  Problem  2. 
Modified  Method  I 

Choose  c°.  Form  A[c^)  and  compute  its  eigenvalues  and  eigenvectors. 
For  :/  =  0,1,2,... 

1.  Stop  if  [  Yl  (^.(c'')  -  A*)^]  ^  is  sufficiently  small. 

1=1 

2.  Form  K^"^  and  h""  (see  (3.5)-(3.7)),  using  g,"  =  g,(c'').    Compute  c^+i  by  solving 
(3.7).  Form  A(c''+i)  . 

3.  Find  the  eigenvalues  {A,(c''+^)}  and  eigenvectors  {g,(c''+^)}  of  Aic"^^).    (Actually 
only  the  first  n  —  s  eigenvalues  and  eigenvectors  are  needed.) 
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Modified  Method  11 

Choose  c°,  form  A{c°)  and  compute  its  matrix  of  eigenvectors  Q(c°).  Set 
g(o)  <_  Q(c°).  (As  in  Method  I  only  the  first  n  —  s  eigenvectors  are  needed.) 
Fort/  =  0, 1,2, ...        .    . 
1.  Stop  if 


){")  \'^  Ai.^^ni^) 


UQl'lsY  McnQl'ls  -  K- 


a\\F 


is  sufficiently  small,  where  qI^^}^  =  [qi,---  ^Qn-sl  ^^'^  K-s  =  a!iasr(Aj, . . . ,  A;_  J. 

2.  Formii'('')  and /i"  (see  (3.5)-(3. 7)),  and  compute  c^^^  by  solving  (3.7).  Form  ^(c^^^). 

3.  Compute  the  factorization 

A{c''+')  =  UNU'^, 
where  U  is  orthogonal  and  N  is  tridiagonal.  Solve 

[iV-At7](C7^r)  =  ?7^gS''^ 
for  r  G  R^^"*,  where  qJ"^  =  {q'(, . . .  ,q!^],  and  compute  the  "Qi2" factorization 

r  =  Qi'^'^T, 

where  T  is  upper  triangular  (if  necessary  modify  <5i    ,  ^  described  in  §3.1,  to  ensure 
that  T  has  full  column  rank).  Next  solve 

[N  -  XU]{U^1^)  =  U^qi  i  =  t  +  l,...,n-s 

and  compute 

*     =M       .  =  .  +  1,...,-.. 

Modified  Method  III 

Choose  c^°^  and  set  Q^°'  <—  Q{c°].  Choose  a  tolerance  parameter  Neglig. 
For  1/  =  0, 1,2, ... 
1.  Stop  if 
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is  sufficiently  small. 

2.  Form/C^'")  and /i"  (see  (3.5)-(3.7)),  and  compute  0"+^  by  solving  (3.7).  Form  A (0"+^). 

3.  For  I  <  i  <  j  <  n,  compute 

y,y  =  \   fZJ. '^  1^*  -  ^;l  >  ^^9l^9 

0  otherwise 


where 


a;  if  1  <  t  <  n  -  s 

{q!f)^  A{c''+^)q!^         otherwise. 


■^i  ~    1    'li^^T  A(^v+l\„v 


4.  Compute  H  =  [I  -  ^Y)Q^''\  factorize  (/  +  \Y)  and  use  this  to  solve  the  n  linear 
systems 

(/+  -y)vi  =  hi         I  =  1,...  ,n, 

where  hi  is  the  z*^  column  of  H,  ajid  set 

To  our  knowledge,  the  modified  methods  are  new.  Indeed,  we  are  not  aware  of  any  dis- 
cussion of  the  correct  general  formulation  of  the  inverse  eigenvalue  problem  in  the  multiple 
eigenvalue  case,  where  the  dimension  of  the  parameter  space  is  chosen  as  in  Problem  2. 
However,  the  principles  on  which  these  ideas  rest  are  well  known  from  the  perturbation 
theory  of  multiple  eigenvalues;  see  Davis  and  Kahan  (1970),  Friedland  (1978)  or  Kato 
(1966).  The  dimension  argument  is  essentially  the  same  as  the  phenomenon  known  in 
quantum  mechanics  as  the  "crossing  rule"  of  Von  Neuman  and  Wigner  (1929);  see  also 
Friedland, Robbin  and  Sylvester (1984). 

The  fact  that  the  number  of  parameters  must  be  increased  in  the  multiple  eigenvalue 
case  is  well  known  in  the  structural  engineering  literature;  see  Olhoff  and  Taylor  (1983, 
p.  1147).  The  dimension  argument  has  also  been  discussed  in  connection  with  the  problem 
of  communality  (see  Harman  (1967))  and  the  educational  testing  problem  (see  Fletcher 
(1985)). 

Since  the  problem  of  communality,  described  in  §1.1,  is  an  important  special  case  that 
has  received  much  attention,  we  will  discuss  it  in  more  detail. 
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If  the  minimum  rank  is  known  to  be  n  —  t  then  t  of  the  eigenvalues  A'  are  zero,  but 
no  other  eigenvalues  axe  given.  Following  our  remarks  at  the  beginning  of  this  section, 
we  know  that  the  problem  will  be  well  posed  in  general  if  n  —  5  eigenvalues  are  specified, 
where,  as  before,  s  =  t{t  —  l)/2.  Thus  the  problem  of  communality  will  be  well  posed  if 
n  —  s  =  t,  OT 

tit  +  1)  ,        s 

n  =  ^^.  (3.15) 

This  equation  is  solvable  only  for  certain  values  of  n.  In  particular,  when  n  =  6  we  can 
expect  to  be  able  to  set  t  =  3,  and  the  modified  methods  will  be  locally  quadratically 
convergent.  However,  for  n  =  7  there  is  no  value  of  t  that  will  satisfy  (3.15).  When 
i  =  3  the  problem  is  underdetermined,  and  a  natural  course  to  take  is  to  consider  instead 
a  minimization  problem  subject  to  the  constraints  AJ  =  Aj  =  A3  =  0.  The  objective 
function  could  be,  for  example,  the  sum  of  the  diagonal  elements  of  the  matrix.  The 
formula  (3.15)  is  well  known  and  can  be  derived  in  several  different  ways;  see  Harman 
(1967,  pp.  69-70)  and  Fletcher  (1985,  p.  502). 

HarmcLn  describes  various  methods  for  solving  the  communality  problem  and  other  re- 
lated problems,  but  none  of  them  seems  to  be  quadratically  convergent.  To  our  knowl- 
edge, the  only  algorithm  for  solving  general  minimal  rank  problems,  which  is  known  to  be 
quadratically  convergent,  is  that  of  Fletcher  (1985).  The  spirit  of  Fletcher's  algorithm  is 
similar  to  that  of  our  modified  methods,  since  it  makes  the  correct  count  of  the  number 
of  equations  and  is  also  related  to  Newton's  method.  The  algorithm  is  based  on  differen- 
tiating the  block  Cholesky  factor  corresponding  to  the  null  space  of  Aq  -r  D;  it  is  actually 
derived  for  problems  where  Aq  +  D  is  constrained  to  be  positive  definite.  The  method 
does  not  coincide  with  any  of  our  modified  methods,  and  it  does  not  seem  to  be  possible 
to  generalize  it  to  handle  other  inverse  eigenvalue  problems. 

There  are  various  numerical  methods,  especially  in  the  engineering  literature,  that  are 
designed  to  handle  optimization  problems  where  multiple  eigenvalues  arise,  either  in  the 
objective  function  or  in  the  constraints  (see  for  example  Polak  and  Wardi  (1982)  and 
Cullum,  Donath  and  Wolfe  (1975)).     Most  of  these  are  first-order  methods,  i.e.     they 
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are  not  quadratically  convergent.    Choi  and  Haug  (1981),  however,  give  a  quadratically 
convergent  method  for  solving  a  specific  design  problem  involving  one  double  eigenvalue. 
We  complete  this  section  with  a  discussion  of  local  existence  and  uniqueness  results  for 
Problem  2.  In  the  following  theorem  we  will  consider  small  perturbations  which  preserve 
eigenvalue  multiplicities. 

Theorem  3.1.  Assume  that  c*  is  a  solution  to  Problem  2.  Suppose  that  K{c'),  de- 
fined by  (3.5)-(3.7)  replacing  {q^}  by  any  orthonormal  set  of  eigenvectors  of  A{c')  ,is 
nonsingular.  Then  there  exists  e  >  0  such  that,  for  all  {m,*}"~'  satisfying 

mI  =  ••■  =  Ht  <  Mt+i  <  •  •  •  <  fJ-n-s 
and 

IMi*  ~  -^i  I  <  f         1  <  t  <  n  —  3, 
Problem  2,  with  {A*}p*  replaced  by  {m,*}?~^,  has  a  locally  unique  solution  d{n*)  with 

||c*-ci(M*)||    =0{e). 
PROOF:   Solving  the  perturbed  problem  is  equivalent  to  solving 

F{d,X,M*)  =  e-^Q^A{d)Qe^  -  M'  =  0, 

where  Q  is  an  orthogonal  matrix  of  eigenvectors  of  A{c'),  M'  =  diag{iJ.^)  and  X  is  a 
skew-symmetric  matrix  with  the  restriction  that  Xij  =0  for  1  <  i  <  j  <  t.  When 
fj,l,. . .  ,^j,^_g  are  fixed,  this  is  an  analytic  system  of  n{n  +  l)/2  equations  in  n{n  -i-  l)/2 
unknowns,  since  3  of  the  {xij}  are  set  to  zero  and  3  of  the  {/x,'}  are  free.  By  expanding  e-^ 
and  neglecting  second-order  terms,  one  sees  that  the  Jacobian  of  F  with  respect  to  d,  X 
and  {;i,'}"_3+i  is  nonsingular  at  d  =  c' ,  X  =  0  and  M*  =  A'  if  and  only  if  K{c')  is 
nonsingular.  The  result  therefore  follows  from  the  implicit  function  theorem;  see  Ortega 
and  Rheinboldt  (1970,  p.l28). 
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This  theorem  shows  that  Problem  2  is  numerically  well  posed.  If  the  perturbation  does 
not  preserve  multiplicities,  the  perturbed  problem  is  in  general  ill  posed.  In  particular, 
some  of  the  components  Xtj  may  change  from  zero  to  arbitrarily  large  values. 

§3.3  Convergence  Analysis. 

In  this  section  we  present  convergence  results  for  Methods  I,  II  and  III.  For  convenience, 
we  first  assume  that  all  the  eigenvalues 

a;  =  •••  =  a;  <A,Vi  <  •••<a;  (s.ie) 

cire  specified,  and  analyze  the  methods  in  their  unmodified  form.  When  i  >  1,  this  means 
that  the  problem  is  overdetermined  as  already  explained,  but  this  causes  no  difficulty  for 
the  convergence  analysis  since  we  assume  existence  of  a  solution.  Afterwards,  we  explain 
how  to  adapt  the  proofs  to  apply  to  the  modified  methods,  when  only  AJ,...,AJ^_3  are 
specified. 

To  start,  we  make  the  following  assumptions. 
Assumption  3.1 

(i)   There  exists  c*  such  that  A{c')  has  eigenvalues  given  by  (3.16). 
(ii)   The  matrices  J{c'^)   used  in  Method  I  satisfy  limsup{!| 7(0^)"^ j|}    <   oo   ,  and  the 

matrices  J^")  of  Methods  II  and  III  satisfy  limsup{||  J^''^''  ||}  <  co. 

1/— .CO 

When  we  analyze  the  modified  methods,  the  second  part,  namely  (ii),  will  be  replaced 
by  an  assumption  on  the  nonsingularity  of  K[c'). 

We  will  now  present  several  preliminary  results  that  will  be  needed  for  the  convergence 
proofs.  Let  pi,. .  .  ,pt  be  any  orthonormal  set  of  eigenvectors  of  A{c')  corresponding  to 
the  multiple  eigenvalue  Aj,  and  let  Pi  =  [pi  . . . pt\.  Denote  the  eigenprojection  of  A{c') 
for  Aj  by  II;  we  have 

Jl  =  P,P^.  (3.17) 
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Let  P2  =  [pt+i---Pn]  be  the  matrix  of.  orthonormal  eigenvectors  corresponding  to  Aj^^, 
. . . ,  A;^.  Given  any  orthogonal  matrix  Q  =  [Qi  (52], -consider  the  problem  of  constructing 
aji  orthogonal  matrix  of  eigenvectors  of  yl(c'), 


P  =  [Pi  P2I 


(3.18) 


which  is,  in  some  sense,  close  to  Q.  Note  that  there  is  freedom  only  in  the  way  Pi  is 
chosen;  specifically  Pi  is  of  the  form  PiB,  where  B  is  an  orthogonal  matrix  of  order  t.  To 
find  Pi  we  start  by  considering  the  matrix  TlQi  whose  columns  are  eigenvectors  for  Aj, 
but  are  not  orthonormal.  Then  we  form  the  "QP"  factorization  of  TLQi: 


nQi  =  PiP, 


(3.19) 


where  R  is  a.  t  x  t  nonsingular  upper  triangular  matrix  and  Pi  is  an  n  x  i  matrix  whose 
columns  axe  orthonormal.  Clearly  the  columns  of  Pi  are  eigenvectors  of  A{c').  Let  us 
define  the  error  matrices 


(3.20) 


El  =  Qi  -UQi 

E2  =  Qi  —  P2- 

Lemma  3.1.  Let  P  =  [Pi  Po]  he  an  orthogonal  matrix  of  eigenvectors  of  Ale").  Then 
there  exist  constants  e  >  0  ajid  C  >  0  such  that,  for  any  orthogonal  matrix  Q  =  [Qi  Q2] 
with  \\Ei\\  <  e,  the  matrix  P  de&ned  by  (3.17)  and  (3.19)  satisfies 


Tp  _ 


g^p  = 


wiiere  1|P||  <  C\\Ei\\-. 


PROOF: 


Pi    P2 


I-F       -QIE2 
E2  Pi     I  —  Qo  E2 


Q\P2 

=  Qj{Q2 

-  E2) 

=  -QIE2 

QlPi 

=  {E2^P2fPl 

=  E^Pi 

Q2P2 

=  Ql{Q2 

-E2) 
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=  I -QIE2 

Let  us  now  consider  the  remaining  block.  Since  IIPi  =  Pi  we  obtain  from  (3.19) 

R  =  P^UQi  =  P^Qi  (3.21) 


and 


R''R={nQif{IlQi) 

=  {Qx-Eif{Qx-Ei)  (3.22) 

=  I-Q'[Ei-E'[Qi+EfEi. 


Now 


EfQ,  =  El{Ei  +  UQi)  =  EjEu 


and  therefore  (3.22)  gives 


it    R  —  /  —  E 1  El . 
By  doing  a  Cholesky  factorization  to  obtain  R  we  see  that,  provided  e  is  small  enough, 

R^  I  -  F  (3.23) 

where  \\F\\  <  CWEiW^.  The  result  then  follows  from  (3.21). 

Corollary    S.l.    There  exist  constajits  C  >  0,    e  >  0  such  that,  for  ajiy  orthogona.} 
matrix  Q  with  \\E\\  =  \\[Ei  E2]\\  <  £,  the  skew-symmetric  matrix  X  defined  by 

e^  =  Q^P 
satisGes 

m  <  c\\E\\ 

WXiiW  <C\\E\\\ 
Here  Xn  is  the  t  x  t  leading  block  of  X. 

PROOF:    It  follows  immediately  from  Lemma  3.1  since  e^^  =  I  +  X  +  0{\\X\\^). 
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These  results  show  that,  given  an  approximate  matrix  of  eigenvectors  Q,  the  eigenvectors 
at  the  solution  may  be  chosen  in  the  form  P,  so  that  X,  which  describes  to  first  order  the 
rotation  from  the  bzisis  Q  to  the  basis  P,  is  0(||£'j|),  and  furthermore  Xn,  which  describes 
the  rotation  of  Qi  to  Pi,  is  0(||£'|p).  This  last  fact  will  be  needed  for  the  analysis  of 
Method  ni. 

For  the  following  Theorem  we  define  the  error  matrix 

E{c)  =  [E,{c)  E2{c)\  =  [Q,{c)-nQi{c)     Q2{c)  -  P2]. 

Since  (/  —  n)Q(c)  is  a  Lipschitz  continuous  function  of  c  (see  for  example  Kato(1966)), 
E{c)  is  also  Lipschitz  continuous  and 

P(c)!|  <L||c-c'||  (3.24) 

holds  for  all  c  near  c*,  where  L  is  some  constajit. 

Theorem  3.1.  There  exists  e  >  0  such  that,  if  ||c°  -  c'\\  <  e,  the  iterates  {c"}  of 
Method  I  converge  quadratically  to  c*. 

PROOF:  Let  c  =  c" ,c  =  c^+^Q  =  Q{c)  and  define  P  by  (3.17)-(3.19)  where  P  =  [Pi  P2] 
is  cLny  orthogonal  matrix  of  eigenvectors  of  A{c*).  Define  X  by  e"^  =  Q^ P.  From  Corollary 
3.1,  ||X||  =  0(||£'||).  Thus  from  (3.24)  we  see  that  ||i;||  and  ||X||  can  be  made  as  small  as 
we  like  by  making  \\c  —  c'\\  small  enough.  Since  P  is  a  matrix  of  eigenvectors  of  A{c'),  we 
have 


e-^A'e-^  =  Q'A{c')Q 


and  thus 


A*  +  XA'  -  A'X  =  Q^A{c-)Q  +  0{\\X\\').  (3.25) 

The  diagonal  equations  of  (3.25)  are 

X',=gJ'A{c-)q,  +  0{\\Xf)  i  =  l,...,n, 


and  therefore 

.    X'  =  J{c)c-  ^b{c)^0{\\Xf).  (3.26) 

The  new  iterate  c  is  defined  by 

A*  =  J{c)c^b{c) 
and  so  it  follows  that 

J{c){c-c')  =  0{\\X\\'). 
Finally,  by  the  nonsingularity  assumption  (see  (i)  in  Assumption  (3.1)), 

\\-c-c'\\=0{\\Xf) 

=  0{\\c-c'f). 

Note  that  there  is  no  need  to  invoke  Rellich's  theorem,  as  is  done  in  the  proof  of  Nocedal 
and  Overton  (1983).  Instead  we  have  made  use  of  the  fact  that  the  eigenprojection,  and 
consequently  E{c),  are  Lipschitz  continuous. 

For  the  convergence  proofs  of  Methods  II  and  HI  we  define  the  error  matrix  £'("'  by 

(3.27) 

Since  all  eigenvalues,  and  the  eigenvectors  corresponding  to  distinct  eigenvalues,  are  Lips- 
chitz continuous  functions  in  a  neighborhood  of  c',  there  exist  constants  Ni  and  N2  such 
that 

||A(c)-A-j|<.Vi||c-c-||  (3.28) 

and 
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||Q2(c)  -P2II  <^2||c-c*||  (3.29) 

hold  for  all  c  near  c' .  Let  us  define 

5=     min    {|A*-A;|}. 
Then  for  i  j^  k,  k  >  t,  we  have 

|Afe(c)-A:|>|A;-A;i-iAfc(c)-A:| 

(3.30) 
>6-Ni\\c-c'\\. 

Since  the  inverse  iteration  (2.13)  is  not  defined  when  an  eigenvalue  Ai(c'')  coincides  with 
a  target  eigenvalue,  we  will  assume  that,  for  all  u  and  for  all  I  <  i  ,j  <  n  ,  A,(c'')  7^  A*. 
This  is  not  a  restriction  in  practice,  since  it  is  known  that  inverse  iteration  will  work  even 
when  a  target  eigenvalue  coincides  with  an  eigenvalue  ^i{c^)  to  machine  accuracy;  see 
Peters  and  Wilkinson  (1971). 

Theorem  3.2.  There  exists  e  >  0  such  that,  if  ||c°  -c*\\  <  e,  the  iterates  {c"}  generated 
by  Method  II  converge  quadratically  to  c' . 

PROOF:   Let  Q  =  Q(''),   Q  =  (Q^^+i),   E  =  E^'^K   E  =  ^jf-^+D  and  define  P  by  (3.17)- 
(3.19)  with  Q  =  Q'^'^K  Let  the  skew-symmetric  matrix  X  be  defined  by  e^  =  Q'^P.   By, 
Corollary  3.1,  ||J\r||  =  C>(||i;'||).  Following  the  same  reasoning  as  in  the  proof  of  Theorem  3.1 
we  see  that  (3.25)  holds,  and  its  diagonal  equations  give 

A*  =  Jc'+ 6  + 0(||Xf),  (3.31) 

where  J  =  J^''^  and  b  =  h" .  The  new  iterate  c  of  Method  II  is  denned  by 

y  =  Jc  +  h, 
and  thus 

J[c-c')  =  0(ilX||-). 
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By  the  nonsingularity  assumption  on  J 


\\c-c'\\=0[\\E\\% 
From  (3.24)    ||£'^°^||  <  I^ljc^  -  c*||.  Let  us  assume  inductively  that 


(3.32) 


l^ll=0(||c-c*|l), 


(3.33) 


aJid  thus,  if  we  caji  show  that 


|i;il  =  o(i|c-c*||), 


(3.34) 


then  interlacing  (3.32)  and  (3.34)  completes  the  proof.  We  analyse  the  two  components  of 
the  error,  Ei  and  E2  separately.  Note  that  while  Q  =  [Qi  qt+i  ■  ■■qn\  denotes  the  matrix 
iterate  Q^^\  Q[c)  —  [Qi{c)  Q2[c)\  =  [Q\[<^)  <7t+i(c)  •  •  ■  9n(c)l  denotes  the  eigenvectors  of 
A[c). 

Part  I  (bound  on  ||^i||): 
From  (3.32)  ajid  (3.33)  we  have 


\c-c'\\  =0(|lc-c*f). 


(3.35) 


Let 


U  = 


U2 


=  Q(3)^gi 


so  that 


Ql  =Qi{c)Ui+Q2{c)U2, 


(3.36) 


where  Ui  is  a  i  x  i  matrix  and  U2  is  a.n  [n  —  t)  x  t  matrix.  Moreover,  Ui  is  invertible,  as 
the  following  argument  shows.  We  have 
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where 


Taking  norms 


U^Ui 


=  QjQi{c)Qi{cfQi 

=  Ql[I-Q2{c)Q2{cf]Qi 

=  /  -.  F^F, 


(3.37) 


F^Q2{cyQi 


\\F\\<\\Q2{-cf{I-n)Q^\\  +  \\Q2{cfUQ4 

<  11(7 -  n)gi||  +  ||Q2(c)  -  P2||||ngi||  +  \\p^nQi\\. 

From  (3.33),  (3.29)  ajid  (3.35),  and  since  the  last  term  is  zero,  we  conclude  that 

||F||  =  0(||c-c'li).  (3.38) 

Therefore  UfUi  =  I  +  0{\\c  —  c*||^).  Provided  \\c  -  c*||  is  small  enough  V^Ui,  and  hence 
U\,  are  nonsingular.  Let  Mi  be  a  constant  such  that 


\\U^'\\<M,. 
The  vectors  corresponding  to  the  multiple  eigenvalue  are  updated  by 


(3.39) 


(3.40) 


[A{c)-\\I\V  =  Qi 

To  simplify  the  analysis  we  will  assume  that  T  is  nonsingular,  i.e.,  there  is  no  need  to 
replace  columns  of  Qi  by  columns  of  the  identity  matrix  (see  the  discussion  in  Section  3.1). 
From  (3.36)  and  (3.40) 


=      Qi{c)      Q2{C) 


T  =  [A{c)-XlI\-'Qi 

=  Q{c)[A{c)-X{I\-'Q{cfQi 

(  [Ai{c)  -  XII]-'  0 

[  0  [A2(c)-Al/l-i 


(3.41) 
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where  Ai  =  diag{Xi, . . . ,  Xt)  and  A2  =  diag{\t+i, . . . ,  A„).  Therefore, 

r  =  gi(c)[Ai(c)  -  xii\-'Uy  +  g2(c)[A2(c)  -  x[i\-'U2.  (3.42) 

From  (3.40)  it  follows  that  T'^T  =  F^F,  and  thus  (3.42)  gives 


(3.43) 


=  U^[A,{c)  -  XII]-\I  +  B'' B)[\,{c)  -  XUr'Uu 
where 

B  =  [A2(c)  -  XlI]-'U2U-'[A,{c)  -  a;/]. 
Using  (3.28),  (3.39)  and  (3.30)  we  obtain 


ll^ll  <  MiNiWc  -  c'\\/{6  -  N,\\c  -  c'W). 

Provided  j|c  —  c*||  is  small  enough,  the  denominator  in  this  relation  is  bounded  away  from 
zero  and  we  can  write 

l|5||=0(|lc-c*||). 

Consider  the  Cholesky  factorization  of  {I  +  B^ B),  i.e.,  let  R  be  an  upper  triangular  matrix 
such  that 

R'^R  =  {I  +  B^B).  (3.44) 

As  with  (3.23)  it  follows  that 

R  =  l  +  0{\\c-c'f).  (3.45) 

We  now  substitute  (3.44)  into  (3.43)  to  obtain 

T  =  R[Ai{c)  -XiI]-^Ui.  (3.46) 
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Thus  from  (3.28)  and  (3.45) 


(3.47) 

=  0(\\c-c'\\). 


Now 


{A{c)  -  A-7]7i  =  qi 
Qi  =  li/\\li\\,  i  =  t  +  l,...,n. 

In  what  follows  i  denotes  an  integer  with  t  <  i  <  n.  We  can  write 


k=l 

for  some  scalars  {ak}.  From  (3.51)  we  have 


n 


^  Afc(c)  -  A- 
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R  =  TU-^[^i{c)-\\I].  (3.48) 

We  can  now  estimate  the  error  Ex.   From  (3.40)   Qi  =  rr~^  and  thus  from  (3.42)  and 
(3.48)  we  obtain 

\\{I  -Tl)Q,\\  <\\{I  -Tl)Qx{c){K,{c)  -  \\I\-'U,T-'\\ 

<||(j-n)gi(c)||iii?-^|| 

+  ||(A2(c)-AI7)-^||i|r-^||. 
Finally,  from  (3.24),  (3.45),  (3.30)  and  (3.47)  we  conclude  that 

11^:11  =  11(1 -n)Qi|l  =  0(||c-c*|l).  (3.50) 

Part  11  (bound  on  ||!^2||) 

The  vectors  {g,}  corresponding  to  the  distinct  eigenvalues  are  updated  by 


(3.51) 


qi  =  ^akqk[c),  (3.52) 


^.=i:Tn^VTT«p)-  (^-^s) 


We  will  now  show  that  if  ||c  —  c'\\  is  small  enough,  a,-  r  0.  The  inequality 

Ik.  -  q^m  <  Ik.  -  9.(c')||  +  \\qi{c)  -  g.(c-)||, 
together  with  (3.33)  and  (3.29),  gives 

|k,-9,(c)||=0(||c-c-||). 
(Note  that  qi{c')  is  one  of  the  columns  of  Po).  Thus 


(3.54) 


Y,  C.I  +  {a.  -  if  =  hi  -  gdm' 

=  0{\\c-c'\\'), 


Jk=l 

kjii 


cind  consequently 


a. -l  +  0(||c-c-||).  (3.55) 

We  will  now  show  that  g,(c)  and  ?,■  are  nearly  parallel.  From  (3.51)  and  (3.53)  we  have 


?.W^9. 


a,- 


l'r.||(A.(3)-A,") 


E 


,,^^fec.i(M.)-A:); 

\t^,  aux,{c)  -  x:y 

Using  (3.28),  (3.30)  and  (3.55)  we  obtain 

?v(c)^g,  =  l  +  0(||c-c*f). 
We  can  now  estimate  E2,  componentwise.  Equations  (3.56)  and  (3.29)  give 

\\q,  -  q,{c')f  =  2[l  -  q^ic'fq,] 

=  2[l  -  q^{cfq,  +  (g.(c)  -  q,{c')fq, 

=  0{\\c-c'\\). 
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(3.56) 


(3.5' 


The  proof  is  completed  by  combining  (3.50)  and  (3.57). 

Theorem  3.3.  Tiiere  exists  e  >  0  such  that,  if  ||i;(°)||  <  e,  then  the  norms  of  the  error 
matrices  {||£'(''^||}  of  Method  III  converge  quadratically  to  zero. 

PROOF:  Let  Q  =  Q^^\  Q  =  Q^'^+^K  E  =  E^''\  E  =  E^^+^^;  define  P  by  (3.17)-(3.19) 
with  Q  =  g^''^  and  define  X  by  e^  =  Q'^P.  As  for  Methods  I  and  11  we  obtain  (3.25), 
provided  ||i^||  is  small  enough.  Moreover,  from  Corollary  3.1  we  have  both 


^  =  o(||i:!|) 

X,,=0{\\Ef). 
The  matrix  Y  and  vector  c  of  Method  III  are  defined  by 


K  -  K 


(3.58) 


A*  +  FA*  -  A*y  -  C5^A(c)Q  (3.59) 

and  I'll  =  0»  i-e. 

Vi,  =0         l<i<i  <t. 
Subtracting  (3.59)  from  (3.25)  we  get 

(X  -  Y)k*  -k\X-Y)  =  Q'^{A[c')  -  A{c))Q  +  0{\\X\\^).  (3.60) 

The  diagonal  equations  give 

J(c-c-)  =  0(l|.Yf), 
where  J  =  J^"^.  By  the  nonsingularity  assumption  and  (3.58)  we  have 

\\c-c'\\=0{\\Ef).  (3.61) 

The  off-diagonal  equations  of  (3.60)  give,  for  i  >  j,   j  >  t 

X,;  -  y.;  =  TT^TlIiMcl  -  A{c))q,  -  0{\\X\\'). 
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It  follows  that 

\xij  -  y.yl  =  0{\\E\\-)         t  <  i  <  j.  (3.62) 

Since  Xu  =  0(||i;|p)  and  Yn  =  0  by  definition  of  the  algorithm,  (3.62)  holds  also  for 
i  <  j  <  t.  Therefore 

\\X-Y\\  =  0{\\Er')  (3.63) 

and  consequently,  from  (3.58) 

\\Y\\  =  0{\\E\\).  (3.64) 

Let  us  now  look  at  the  updated  matrix 

Q  =  g(/+iy)(7-iy)-^ 

We  have 

Q-P  =  Q[{I+  k,Y){I  -  iy)-^  -  e^] 

=  Q[{i+kY)  -  (/  +  x  +  o(i|xf ))(/-  iy)](7-  |y)-^ 

=  Q[Y-X+  0{\\XY\\  +  ||Xf )]  (/  -  iy)-i. 
Thus  from  (3.63)  and  (3.64) 

\\Q-P\\  =  0{\\Ef).  (3.65) 

Since  {I  —  Tl)Pi  =0  we  have 

Ei  =  {I-  Tl)Q, 

=  0{\\Ef') 
and 
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E2  =  Q2-  P2 

=  0{\\E\\'). 

The  proof  indicates  clearly  why  it  is  appropriate  to  set  Yn  =0,  namely  to  obtain  (3.62). 
It  also  follows  that  Yn  can  have  any  value  satisfying  Yn  =  0(||i;'|p).  It  is  easy  to  modify 
this  proof  so  as  to  show  that  the  parameters  {c"}  converge  quadratically  to  c*.  However, 
we  have  analyzed  the  behavior  of  the  matrices  {Q^'^^}  because  one  can  view  Method 
III  esssentially  as  a  procedure  for  generating  these  matrices  (  the  computation  of  {c"} 
being  only  an  intermediate  step  ).  Moreover,  we  will  now  show  that  the  matrices  Q^'^'^  of 
Method  in  converge  to  a  limit,  which  is  not  the  case  for  Methods  I  and  11. 

Corollary  3.2.  Suppose  that  \\E''^°^\\  <  e,  wiiere  e  is  given  by  Theorem  3.3.  Then  the 
matrices  {Q^'^^}  generated  by  Method  III  converge  quadratically  to  a  limit  Q* . 

PROOF:   Note  that 

Q  -  Q  =  Q{{I  +  ^Y){I  -  ^Y)-'  -  I) 

=  Q{Y  +  0{\\Y\\^))  (3.66) 

-^0{\\E\\), 

where  the  last  step  follows  from  (3.64).  By  Theorem  3.3  {Q^"'}  is  a  Cauchy  sequence 
and  therefore  has  a  limit  Q'  =  [Ql  Qjl-  (Observe  that  Q^  is  equal  to  Po-)  Moreover,  by 
the  quadratic  convergence  of  {Hi^^"^!!},  and  (3.66),  we  have 


Q-Q'=   E  {Q^'^  -  Q^'^'^) 

/      ,N  (3.67) 

=  0{\\E^''^\\-). 

E^  =  {I-Il)Qr 

=  (Qi  -  Q[)  +  [{Ql  -  Qx)  +  (Qi  -  ^1)  +  {Pi  -  ngo] . 
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Also 


Using  (3.67),  (3.65),  (3.19)  and  (3.23)  we  have 

Ei^{Q,-Ql)  +  0{\\Ef).  (3.68) 

Since  Q2  =  P2  we  also  have 

E2  =  Q2-  Ql-  (3.69) 

Combining  (3.68)  and  (3.69) 

E=[{Q,-Q\)     {Q,  -  QD]  +  0{\\E\\') 
=  iQ-Q')  +  0{\\Ef). 
Therefore 

E  =  0{\\Q-Q'\\), 

which  together  with  (3.67)  completes  the  proof. 

Now  we  show  how  to  adapt  the  proofs  so  that  they  apply  to  the  modified  algorithms 
designed  for  solving  Problem  2.  We  are  given  only  the  n  —  s  smallest  eigenvalues 

XI  =  ...  =  a;<  A,Vi  <  •••<a;_,.  (3.70) 

We  replace  Assumption  3.1  by 

ASSUMPTION    3.2: 

(i)   There  exists  c*  such  that  A{c')  has  eigenvalues  given  by  (3.70) 

(ii)   The  matrix  K{c'),  defined  by  (3.5)-(3.7)  using  any  orthonormal  set  of  eigenvectors  of 
A{c'),  is  nonsingular. 

Theorem    3.4.    There  exists  e  >  0  such  that,  if    [|c°  -  c*!|   <  e,  the  iterates  {c"^}  of 
ModiSed  Method  I  converge  quadratically  to  c' . 

PROOF:    It  follows  the  proof  of  Theorem  3.1  very  closely.  The  only  difference  is  that  the 
new  iterate  c  is  defined  by  (3.5)-(3.7),  where  q'^  refers  to  the  computed  eigenvector  g,(c''). 
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Using  the  first  n  —  s  diagonal  equations  in  (3.25)  plus  the  equations  corresponding  to 
I  <  i  <  j  <  t  we  obtain 

K{c)i-c-c')  =  0{\\X\\'). 

The  rest  of  the  proof  follows  as  before. 

Theorem  3.5.  There  exists  e  >  0  such  that,  if  ||c°  -  c'\\  <  e  the  iterates  {c"}  of 
ModiSed  Method  II  converge  quadratically  to  c* . 

PROOF:  Again,  this  follows  the  proof  of  Theorem  3.2  very  closely,  differing  just  as  de- 
scribed for  Theorem  3.4.  Now  the  q'^  refer  to  the  vectors  updated  by  inverse  iteration. 

Theorem  3.6.  There  exists  e  >  0  such  that,  if  i|£'(°^||  <  e,  the  norms  of  the  error 
matrices  {||£'(''^||}  of  Modi&ed  Method  III  converge  quadratically  to  zero. 

PROOF:  As  in  the  proofs  of  the  two  previous  theorems,  replacing  the  marix  J  hy  K  allows 
us  to  obtain 

c-c'=0{\\E\\'). 
For  the  second  part  we  need  to  consider  the  unspecified  eigenvalues.  From  (3.25)  and  (3.8) 

A,  -  A,'  =  qJ'Aic  -  c')qi  +  0(||i;|i-)  n  -  s  <  i  <  n. 

The  off"-diagonal  equations  not  included  in  K  give 

^i;  -  y.j  =  ^ YlJ^i^^'  -  c)9;  +  0[\\Ef)         i  <  j,     J  >  t. 

Ay  —  A, 

It  follows  that  Xij  —  y,j  =  0(||£'||"),  provided  the  unspecified  eigenvalues  at  the  solution 
are  each  distinct  from  all  other  eigenvalues.  As  explained  before,  this  last  condition  can  be 
removed  by  introducing  a  tolerance  parameter  Neglig  (see  step  3  of  Modifed  Method  III). 
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§4.    NUMERICAL  RESULTS 

We  have  tested  Methods  I,  11  III  and  FV'  on  various  types  of  problems.  In  our  experi- 
ence, Method  IV  almost  always  requires  more  iterations  for  convergence  than  the  other 
methods,  and  also  encounters  difficulties  more  often.  On  the  other  hand,  the  local  be- 
haviour of  Methods  I,  II  and  III  is  very  similar,  as  is  illustrated  by  the  three  examples 
we  present  below.  The  tests  were  made  on  VAX  ll/780s,  at  the  Courant  Mathematics 
ajid  Computing  Laboratory  and  at  Northwestern  University.  Double  precision  arithmetic 
WcLS  used  throughout,  i.e.  approximately  14  decimal  digits  of  accuracy.  The  eigenvalues 
and  eigenvectors  were  computed  using  the  EISPACK  subroutines;  see  Smith  et  al  (1967). 
The  stcLTting  points  were  chosen  close  to  the  solution,  so  that  few  iterations  were  required 
for  convergence.  A  line  search  (or  a  trust  region  strategy)  would  be  essential  to  make 
the  algorithms  convergent  in  practical  applications.  However,  we  have  not  included  these 
features  and  have  concentrated  on  the  local  behavior  of  the  methods.  In  particular,  we 
were  interested  in  verifying  that  quadratic  convergence  takes  place  in  practice,  in  both  the 
distinct  and  the  multiple  eigenvalue  cases. 

We  programmed  the  Modified  Methods  I,  II  and  III  as  described  in  §  3.  The  iterations 
were  stopped  when  the  residual,  defined  in  Step  1  of  each  method,  was  less  than  10~^.  The 
parameter  Neglig  required  by  Method  HI  was  set  to  10~^^.  For  convenience,  all  vectors 
will  be  written  as  row-vectors.  When  specifying  a  symmetric  matrix  we  will  only  write  its 
lower  triangular  part. 

Example  1         This  is  an  additive  inverse  problem  with  distinct  eigenvalues.  Here  n  =  8, 


Ac 


0 

4 

0 

1 

-  1 

0 

1 

2 

3 

0 

1 

1 

1 

1 

0 

5 

4 

3 

2 

1 

0 

1 

-  1 

-  1 

-  1 

-  1 

-  1 

0 

1 

2 

3 

4 

5 

6 

7 

Ak  = 


T 


k=l, 
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A*  =(10,20,30,40,50,60,70,80) 
c°  =(10,  20,  30, 40, 50, 60,  70, 80) 
c*  =(11.90788, 19.70552, 30.54550, 40.06266, 
51.58714,64.70213,70.17068,71.31850). 
The  following  table  displays  the  values  of  the  residual,  for  each  method. 


Iteration 

Method  I 

Method  II 

Method  III 

0 

6.40i;  -f  00 

6.40i;  +  00 

6.40i;  ~  00 

1 

8.93i;  -  01 

1.51i;  +  00 

1.23i;  +  00 

2 

1.03E  -  01 

9.74i;  -  02 

1.45^;- 01 

3 

2.72E  -  03 

1.97jE;  -  03 

3.48£;  -  03 

4 

2.32^;- 06 

1.14.2;  -  06 

2.58E  -  06 

5 

1.69E  -  12 

4.04^;  -  13 

1.50i;-  12 

Example  2         We  define  a  problem  with  multiple  eigenvalues  and  n  =  8.  First  consider 
the  8x5  matrix 


V  = 


—  1 

-3 

-5 

-6 

—    ^ 

-5 

-  17 

—  1 

-  1 

5 

18 

1 

2 

0 

—  1 

2 

0 

1 

3 

0 

-  1 

2.5 

.2 

.3 

.5 

.6 

2 

_  .2 

.3 

.5 

.8 

and  define  B  =  I  +  VV'^ .   Now  define  the  matrices  {A^}  from  B  zs,  follows:   let  .4o 
and  for  k  =  1, . . . ,  n 


=  0 


fc-i 


Afc  =  ^  bkj{ekej  +  ejel)  +  bkkSkel. 
;  =  i 

Now  consider  c  =  (1, 1, 1, 1, 1, 1, 1, 1);  by  construction, 


A{c)  =  B  ^  I  +  VV'^. 
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It  follows  that  1  is  a  multiple  eigenvalue  of  A{c)  with  multiplicity  3.  In  fact,  the  eigenvalues 
of  .4(c)  are 

(1, 1, 1,  2.12075, 9.21887, 17.2814, 35.7082,  722.681) . 

Now  let  us  choose  the  target  eigenvalues  A*.  A  suitable  choice  is 

A-  =  (1,1,1,2.1,9.0) 

i.e.  specifying  one  eigenvalue  of  multiplicity  3  and  2  distinct  eigenvalues.  Then  i  =  3,  s  =  3 
cind  n  —  s  =  o,  and  the  dimensions  are  properly  chosen  for  the  formulation  of  Problem  2. 
We  could  use  c"  as  a  starting  point  for  the  methods,  but  instead  we  choose 

c°  =  (.99,  .99,  .99,  .99,  l.Oi,  1.01, 1.01, 1.01). 
The  locally  unique  solution  found  by  all  three  modified  methods  is 

c'  =(.9833610,  .9743705,  .9753132, 1.054523, 

.8554860,  .9117770,  .9283310,  .8880013). 
The  following  table  displays  the  residual  for  the  three  methods. 


Iteration 

Method  I 

Method  II 

Method  III 

0 

2.09^  -  01 

2.09i;  -  01 

2mE  -  01 

1 

1.92i;  -  01 

2.26^;  -  01 

2.79E-01 

2 

2.04i;  -  01 

1.54i;-01 

1.99^  -  02 

3 

3.23E  -  02 

2.03^  -  02 

1.26^;  -  02 

4 

7.11i;  -  03 

2.45E-03 

2.67E  -  04 

5 

1.44^;  -  04 

2.19j5;-05 

3.18^-07 

6 

7.89E  -  08 

1.85i;-09 

S.ooE  -  13 

7 

3.66^  -  14 

Example  S         This  is  an  additive  inverse  problem  with  multiple  eigenvalues.  Here  n  =  6, 


Ao  = 


0 

6.3 

0 

-  1 

-3.7 

0 

-2 

-6 

.3 

0 

1 

3 

-  1 

-2.7 

6 

12 

-  4 

4.0 

0 
1.3     0 


T 


k  =  1,...,6 
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A-  =(0,0,0) 

c°  =(3,14,3,14,1,18)  •-.  .-.•«^^^!  ^'-' 

c*  =(3.308477, 14.17183, 2.225671, 13.54877,  .9512727, 17.67949). 

Note  that  the  problem  is  well  posed  by  specifying  only  one  eigenvalue  of  multiplicity  three. 
The  residual  values  are  given  below. 

Iteration  Method  I  Method  II  Method  III 

0  2.47i;-01  2.47^;- 01  2.47^-01 

1  1.50^-01  1.48i;-01  1.47i;-01 

2  1A3E  -  02  2.29E  -  02  2.58£'  -  02 

3  2.89^;- 04  5.71^;- 04  6.58.E;  -  04 

4  9.63E  -  08  3.76E  -  07  4.97.2;  -  07 

5  1.22^;- 14  1.86.&-13  3.2l£;  -  13 

These  examples,  and  our  overall  numerical  experience  with  the  three  methods,  indicate 
that  quadratic  convergence  indeed  occurs  in  practice,  and  that  the  three  methods  have 
very  similar  local  behavior. 
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